Search Results: "voc"

16 August 2023

Sam Hartman: A First Exercise with AI Training

Taking a hands-on low-level approach to learning AI has been incredibly rewarding. I wanted to create an achievable task that would motivate me to learn the tools and get practical experience training and using large language models. Just at the point when I was starting to spin up GPU instances, Llama2 was released to the public. So I elected to start with that model. As I mentioned, I m interested in exploring how sex-positive AI can help human connection in positive ways. For that reason, I suspected that Llama2 might not produce good results without training: some of Meta s safety goals run counter to what I m trying to explore. I suspected that there might be more attention paid to safety in the chat variants of Llama2 rather than the text generation variants, and working against that might be challenging for a first project, so I started with Llama-2-13b as a base. Preparing a Dataset I elected to generate a fine tuning dataset using fiction. Long term, that might not be a good fit. But I ve always wanted to understand how an LLM s tone is adjusted how you get an LLM to speak in a different voice. So much of fine tuning focuses on examples where a given prompt produces a particular result. I wanted to understand how to bring in data that wasn t structured as prompts. The Huggingface course actually gives an example of how to adjust a model set up for masked language modeling trained on wikitext to be better at predicting the vocabulary of movie reviews. There though, doing sample breaks in the dataset at movie review boundaries makes sense. There s another example of training an LLM from scratch based on a corpus of python code. Between these two examples, I figured out what I needed. It was relatively simple in retrospect: tokenize the whole mess, and treat everything as output. That is, compute loss on all the tokens. Long term, using fiction as a way to adjust how the model responds is likely to be the wrong starting point. However, it maximized focus on aspects of training I did not understand and allowed me to satisfy my curiosity. Rangling the Model I decided to actually try and add additional training to the model directly rather than building an adapter and fine tuning a small number of parameters. Partially this was because I had enough on my mind without understanding how LoRA adapters work. Partially, I wanted to gain an appreciation for the infrastructure complexity of AI training. I have enough of a cloud background that I ought to be able to work on distributed training. (As it turned out, using BitsAndBytes 8-bit optimizer, I was just able to fit my task onto a single GPU). I wasn t even sure that I could make a measurable difference in Llama-2-13b running 890,000 training tokens through a couple of training epochs. As it turned out I had nothing to fear on that front. Getting everything to work was more tricky than I expected. I didn t have an appreciation for exactly how memory intensive training was. The Transformers documentation points out that with typical parameters for mixed-precision training, it takes 18 bytes per model parameter. Using bfloat16 training and an 8-bit optimizer was enough to get things to fit. Of course then I got to play with convergence. My initial optimizer parameters caused the model to diverge, and before I knew it, my model had turned to NAN, and would only output newlines. Oops. But looking back over the logs, watching what happened to the loss, and looking at the math in the optimizer to understand how I ended up getting something that rounded to a divide by zero gave me a much better intuition for what was going on. The results. This time around I didn t do anything in the way of quantitative analysis of what I achieved. Empirically I definitely changed the tone of the model. The base Llama-2 model tends to steer away from sexual situations. It s relatively easy to get it to talk about affection and sometimes attraction. Unsurprisingly, given the design constraints, it takes a bit to get it to wonder into sexual situations. But if you hit it hard enough with your prompt, it will go there, and the results are depressing. At least for prompts I used, it tended to view sex fairly negatively. It tended to be less coherent than with other prompts. One inference managed to pop out in the middle of some text that wasn t hanging together well, Chapter 7 - Rape. With my training, I did manage to achieve my goal of getting the model to use more positive language and emotional signaling when talking about sexual situations. More importantly, I gained a practical understanding of many ways training can go wrong. A lot of articles I ve been reading about training make more sense. I have better intuition for why you might want to do training a certain way, or why mechanisms for countering some problem will be important. Future Activities:

comment count unavailable comments

9 August 2023

Antoine Beaupr : OpenPGP key transition

This is a short announcement to say that I have changed my main OpenPGP key. A signed statement is available with the cryptographic details but, in short, the reason is that I stopped using my old YubiKey NEO that I have worn on my keyring since 2015. I now have a YubiKey 5 which supports ED25519 which features much shorter keys and faster decryption. It allowed me to move all my secret subkeys on the key (including encryption keys) while retaining reasonable performance. I have written extensive documentation on how to do that OpenPGP key rotation and also YubiKey OpenPGP operations.

Warning on storing encryption keys on a YubiKey People wishing to move their private encryption keys to such a security token should be very careful as there are special precautions to take for disaster recovery. I am toying with the idea of writing an article specifically about disaster recovery for secrets and backups, dealing specifically with cases of death or disabilities.

Autocrypt changes One nice change is the impact on Autocrypt headers, which are considerably shorter. Before, the header didn't even fit on a single line in an email, it overflowed to five lines:
Autocrypt: addr=anarcat@torproject.org; prefer-encrypt=nopreference;
 keydata=xsFNBEogKJ4BEADHRk8dXcT3VmnEZQQdiAaNw8pmnoRG2QkoAvv42q9Ua+DRVe/yAEUd03EOXbMJl++YKWpVuzSFr7IlZ+/lJHOCqDeSsBD6LKBSx/7uH2EOIDizGwfZNF3u7X+gVBMy2V7rTClDJM1eT9QuLMfMakpZkIe2PpGE4g5zbGZixn9er+wEmzk2mt20RImMeLK3jyd6vPb1/Ph9+bTEuEXi6/WDxJ6+b5peWydKOdY1tSbkWZgdi+Bup72DLUGZATE3+Ju5+rFXtb/1/po5dZirhaSRZjZA6sQhyFM/ZhIj92mUM8JJrhkeAC0iJejn4SW8ps2NoPm0kAfVu6apgVACaNmFb4nBAb2k1KWru+UMQnV+VxDVdxhpV628Tn9+8oDg6c+dO3RCCmw+nUUPjeGU0k19S6fNIbNPRlElS31QGL4H0IazZqnE+kw6ojn4Q44h8u7iOfpeanVumtp0lJs6dE2nRw0EdAlt535iQbxHIOy2x5m9IdJ6q1wWFFQDskG+ybN2Qy7SZMQtjjOqM+CmdeAnQGVwxowSDPbHfFpYeCEb+Wzya337Jy9yJwkfa+V7e7Lkv9/OysEsV4hJrOh8YXu9a4qBWZvZHnIO7zRbz7cqVBKmdrL2iGqpEUv/x5onjNQwpjSVX5S+ZRBZTzah0w186IpXVxsU8dSk0yeQskblrwARAQABzSlBbnRvaW5lIEJlYXVwcsOpIDxhbmFyY2F0QHRvcnByb2plY3Qub3JnPsLBlAQTAQgAPgIbAwULCQgHAwUVCgkICwUWAgMBAAIeAQIXgBYhBI3JAc5kFGwEitUPu3khUlJ7dZIeBQJihnFIBQkacFLiAAoJEHkhUlJ7dZIeXNAP/RsX+27l9K5uGspEaMH6jabAFTQVWD8Ch1om9YvrBgfYtq2k/m4WlkMh9IpT89Ahmlf0eq+V1Vph4wwXBS5McK0dzoFuHXJa1WHThNMaexgHhqJOs
 S60bWyLH4QnGxNaOoQvuAXiCYV4amKl7hSuDVZEn/9etDgm/UhGn2KS3yg0XFsqI7V/3RopHiDT+k7+zpAKd3st2V74w6ht+EFp2Gj0sNTBoCdbmIkRhiLyH9S4B+0Z5dUCUEopGIKKOSbQwyD5jILXEi7VTZhN0CrwIcCuqNo7OXI6e8gJd8McymqK4JrVoCipJbLzyOLxZMxGz8Ki0b9O844/DTzwcYcg9I1qogCsGmZfgVze2XtGxY+9zwSpeCLeef6QOPQ0uxsEYSfVgS+onCesSRCgwAPmppPiva+UlGuIMun87gPpQpV2fqFg/V8zBxRvs6YTGcfcQjfMoBHmZTGb+jk1//QAgnXMO7fGG38YH7iQSSzkmodrH2s27ZKgUTHVxpBL85ptftuRqbR7MzIKXZsKdA88kjIKKXwMmez9L1VbJkM4k+1Kzc5KdVydwi+ujpNegF6ZU8KDNFiN9TbDOlRxK5R+AjwdS8ZOIa4nci77KbNF9OZuO3l/FZwiKp8IFJ1nK7uiKUjmCukL0od/6X2rJtAzJmO5Co93ZVrd5r48oqUvjklzzsBNBFmeC3oBCADEV28RKzbv3dEbOocOsJQWr1R0EHUcbS270CrQZfb9VCZWkFlQ/1ypqFFQSjmmUGbNX2CG5mivVsW6Vgm7gg8HEnVCqzL02BPY4OmylskYMFI5Bra2wRNNQBgjg39L9XU4866q3BQzJp3r0fLRVH8gHM54Jf0FVmTyHotR/Xiw5YavNy2qaQXesqqUv8HBIha0rFblbuYI/cFwOtJ47gu0QmgrU0ytDjlnmDNx4rfsNylwTIHS0Oc7Pezp7MzLmZxnTM9b5VMprAXnQr4rewXCOUKBSto+j4rD5/77DzXw96bbueNruaupb2Iy2OHXNGkB0vKFD3xHsXE2x75NBovtABEBAAHCwqwEGAEIACAWIQSNyQHOZBRsBIrVD7t5IVJSe3WSHgUCWZ4LegIbAgFACRB5IV
 JSe3WSHsB0IAQZAQgAHRYhBHsWQgTQlnI7AZY1qz6h3d2yYdl7BQJZngt6AAoJED6h3d2yYdl7CowH/Rp7GHEoPZTSUK8Ss7crwRmuAIDGBbSPkZbGmm4bOTaNs/gealc2tsVYpoMx7aYgqUW+t+84XciKHT+bjRv8uBnHescKZgDaomDuDKc2JVyx6samGFYuYPcGFReRcdmH0FOoPCn7bMW5mTPztV/wIA80LZD9kPKIXanfUyI3HLP0BPwZG4WTpKzJaalR1BNwu2oF6kEK0ymH3LfDiJ5Sr6emI2jrm4gH+/19ux/x+ST4tvm2PmH3BSQOPzgiqDiFd7RZoAIhmwr3FW4epsK9LtSxsi9gZ2vATBKO1oKtb6olW/keQT6uQCjqPSGojwzGRT2thEANH+5t6Vh0oDPZhrKUXRAAxHMBNHEaoo/M0sjZo+5OF3Ig1rMnI6XbKskLv6hu13cCymW0w/5E4XuYnyQ1cNC3pLvqDQbDx5mAPfBVHuqxJdRLQ3yDM/D2QIsxnkzQwi0FsJuni4vuJzWK/NHHDCvxMCh0YmSgbptUtgW8/niatd2Y6MbfRGxUHoctKtzqzivC8hKMTFrj4AbZhg/e9QVCsh5zSXtpWP0qFDJsxRMx0/432n9d4XUiy4U672r9Q09SsynB3QN6nTaCTWCIxGxjIb+8kJrRqTGwy/PElHX6kF0vQUWZNf2ITV1sd6LK/s/7sH+x4rzgUEHrsKr/qPvY3rUY/dQLd+owXesY83ANOu6oMWhSJnPMksbNa4tIKKbjmw3CFIOfoYHOWf3FtnydHNXoXfj4nBX8oSnkfhLILTJgf6JDFXfw6mTsv/jMzIfDs7PO1LK2oMK0+prSvSoM8bP9dmVEGIurzsTGjhTOBcb0zgyCmYVD3S48vZlTgHszAes1zwaCyt3/tOwrzU5JsRJVns+B/TUYaR/u3oIDMDygvE5ObWxXaFVnCC59r+zl0FazZ0ouyk2AYIR
 zHf+n1n98HCngRO4FRel2yzGDYO2rLPkXRm+NHCRvUA/i4zGkJs2AV0hsKK9/x8uMkBjHAdAheXhY+CsizGzsKjjfwvgqf84LwAzSDdZqLVE2yGTOwU0ESiArJwEQAJhtnC6pScWjzvvQ6rCTGAai6hrRiN6VLVVFLIMaMnlUp92EtgVSNpw6kANtRTpKXUB5fIPZVUrVdfEN06t96/6LE42tgifDAFyFTZY5FdHHri1GG/Cr39MpW2VqCDCtTTPVWHTUlU1ZG631BJ+9NB+ce58TmLr6wBTQrT+W367eRFBC54EsLNb7zQAspCn9pw1xf1XNHOGnrAQ4r9BXhOW5B8CzRd4nLRQwVgtw/c5M/bjemAOoq2WkwN+0mfJe4TSfHwFUozXuN274X+0Gr10fhp8xEDYuQM0qu6W3aDXMBBwIu0jTNudEELsTzhKUbqpsBc9WjwNMCZoCuSw/RTpFBV35mXbqQoQgbcU7uWZslLl9Wvv/C6rjXgd+GeX8SGBjTqq1ZkTv5UXLHTNQzPnbkNEExzqToi/QdSjFMIACnakeOSxc0ckfnsd9pfGv1PUyPyiwrHiqWFzBijzGIZEHxhNGFxAkXwTJR7Pd40a7RDxwbO6p/TSIIum41JtteehLHwTRDdQNMoyfLxuNLEtNYS0uR2jYI1EPQfCNWXCdT2ZK/l6GVP6jyB/olHBIOr+oVXqJh+48ki8cATPczhq3fUr7UivmguGwD67/4omZ4PCKtz1hNndnyYFS9QldEGo+AsB3AoUpVIA0XfQVkxD9IZr+Zu6aJ6nWq4M2bsoxABEBAAHCwXYEGAEIACACGwwWIQSNyQHOZBRsBIrVD7t5IVJSe3WSHgUCWPerZAAKCRB5IVJSe3WSHkIgEACTpxdn/FKrwH0/LDpZDTKWEWm4416l13RjhSt9CUhZ/Gm2GNfXcVTfoF/jKXXgjHcV1DHjfLUPmPVwMdqlf5ACOiFqIUM2ag/OEARh356w
 YG7YEobMjX0CThKe6AV2118XNzRBw/S2IO1LWnL5qaGYPZONUa9Pj0OaErdKIk/V1wge8Zoav2fQPautBcRLW5VA33PH1ggoqKQ4ES1hc9HC6SYKzTCGixu97mu/vjOa8DYgM+33TosLyNy+bCzw62zJkMf89X0tTSdaJSj5Op0SrRvfgjbC2YpJOnXxHr9qaXFbBZQhLjemZi6zRzUNeJ6A3Nzs+gIc4H7s/bYBtcd4ugPEhDeCGffdS3TppH9PnvRXfoa5zj5bsKFgjqjWolCyAmEvd15tXz5yNXtvrpgDhjF5ozPiNp/1EeWX4DxbH2i17drVu4fXwauFZ6lcsAcJxnvCA28RlQlmEQu/gFOx1axVXf6GIuXnQSjQN6qJbByUYrdc/cFCxPO2/lGuUxnufN9Tvb51Qh54laPgGLrlD2huQeSD9Sxa0MNUjNY0qLqaReT99Ygb2LPYGSLoFVx9iZz6sZNt07LqCx9qNgsJwsdmwYsNpMuFbc7nkWjtlEqzsXZHTvYN654p43S+hcAhmmOzQZcew6h71fAJLciiqsPBnCEdgCGFAWhZZdPkMA==
After the change, the entire key fits on a single line, neat!
Autocrypt: addr=anarcat@torproject.org; prefer-encrypt=nopreference;
 keydata=xjMEZHZPzhYJKwYBBAHaRw8BAQdAWdVzOFRW6FYVpeVaDo3sC4aJ2kUW4ukdEZ36UJLAHd7NKUFudG9pbmUgQmVhdXByw6kgPGFuYXJjYXRAdG9ycHJvamVjdC5vcmc+wpUEExYIAD4WIQS7ts1MmNdOE1inUqYCKTpvpOU0cwUCZHZgvwIbAwUJAeEzgAULCQgHAwUVCgkICwUWAgMBAAIeAQIXgAAKCRACKTpvpOU0c47SAPdEqfeHtFDx9UPhElZf7nSM69KyvPWXMocu9Kcu/sw1AQD5QkPzK5oxierims6/KUkIKDHdt8UcNp234V+UdD/ZB844BGR2UM4SCisGAQQBl1UBBQEBB0CYZha2IMY54WFXMG4S9/Smef54Pgon99LJ/hJ885p0ZAMBCAfCdwQYFggAIBYhBLu2zUyY104TWKdSpgIpOm+k5TRzBQJkdlDOAhsMAAoJEAIpOm+k5TRzBg0A+IbcsZhLx6FRIqBJCdfYMo7qovEo+vX0HZsUPRlq4HkBAIctCzmH3WyfOD/aUTeOF3tY+tIGUxxjQLGsNQZeGrQI
Note that I have implemented my own kind of ridiculous Autocrypt support for the Notmuch Emacs email client I use, see this elisp code. To import keys, I pipe the message into this script which is basically just:
sq autocrypt decode   gpg --import
... thanks to Sequoia best-of-class Autocrypt support.

Note on OpenPGP usage While some have claimed OpenPGP's death, I believe those are overstated. Maybe it's just me, but I still use OpenPGP for my password management, to authenticate users and messages, and it's the interface to my YubiKey for authenticating with SSH servers. I understand people feel that OpenPGP is possibly insecure, counter-intuitive and full of problems, but I think most of those problems should instead be attributed to its current flagship implementation, GnuPG. I have tried to work with GnuPG for years, and it keeps surprising me with evilness and oddities. I have high hopes that the Sequoia project can bring some sanity into this space, and I also hope that RFC4880bis can eventually get somewhere so we have a more solid specification with more robust crypto. It's kind of a shame that this has dragged on for so long, but Update: there's a separate draft called openpgp-crypto-refresh that might actually be adopted as the "OpenPGP RFC" soon! And it doesn't keep real work from happening in Sequoia and other implementations. Thunderbird rewrote their OpenPGP implementation with RNP (which was, granted, a bumpy road because it lost compatibility with GnuPG) and Sequoia now has a certificate store with trust management (but still no secret storage), preliminary OpenPGP card support and even a basic GnuPG compatibility layer. I'm also curious to try out the OpenPGP CA capabilities. So maybe it's just because I'm becoming an old fart that doesn't want to change tools, but so far I haven't seen a good incentive in switching away from OpenPGP, and haven't found a good set of tools that completely replace it. Maybe OpenSSH's keys and CA can eventually replace it, but I suspect they will end up rebuilding most of OpenPGP anyway, just more slowly. If they do, let's hope they avoid the mistakes our community has done in the past at least...

1 August 2023

Reproducible Builds: Supporter spotlight: Simon Butler on business adoption of Reproducible Builds

The Reproducible Builds project relies on several projects, supporters and sponsors for financial support, but they are also valued as ambassadors who spread the word about our project and the work that we do. This is the seventh instalment in a series featuring the projects, companies and individuals who support the Reproducible Builds project. We started this series by featuring the Civil Infrastructure Platform project, and followed this up with a post about the Ford Foundation as well as recent ones about ARDC, the Google Open Source Security Team (GOSST), Bootstrappable Builds, the F-Droid project and David A. Wheeler. Today, however, we will be talking with Simon Butler, an associate senior lecturer in the School of Informatics at the University of Sk vde, where he undertakes research in software engineering that focuses on IoT and open source software, and contributes to the teaching of computer science to undergraduates.

Chris: For those who have not heard of it before, can you tell us more about the School of Informatics at Sk vde University? Simon: Certainly, but I may be a little long-winded. Sk vde is a city in the area between the two large lakes in southern Sweden. The city is a busy place. Sk vde is home to the regional hospital, some of Volvo s manufacturing facilities, two regiments of the Swedish defence force, a lot of businesses in the Swedish computer games industry, other tech companies and more. The University of Sk vde is relatively small. Sweden s large land area and low population density mean that regional centres such as Sk vde are important and local universities support businesses by training new staff and supporting innovation. The School of Informatics has two divisions. One focuses on teaching and researching computer games. The other division encompasses a wider range of teaching and research, including computer science, web development, computer security, network administration, data science and so on.
Chris: You recently had a open-access paper published in Software Quality Journal. Could you tell us a little bit more about it and perhaps briefly summarise its key findings? Simon: The paper is one output of a collaborative research project with six Swedish businesses that use open source software. There are two parts to the paper. The first consists of an analysis of what the group of businesses in the project know about Reproducible Builds (R-Bs), their experiences with R-Bs and their perception of the value of R-Bs to the businesses. The second part is an interview study with business practitioners and others with experience and expertise in R-Bs. We set out to try to understand the extent to which software-intensive businesses were aware of R-Bs, the technical and business reasons they were or were not using R-Bs and to document the business and technical use cases for R-Bs. The key findings were that businesses are aware of R-Bs, and some are using R-Bs as part of their day-to-day development process. Some of the uses for R-Bs we found were not previously documented. We also found that businesses understood the value R-Bs have as part of engineering and software quality processes. They are also aware of the costs of implementing R-Bs and that R-Bs are an intangible value proposition - in other words, businesses can add value through process improvement by using R-Bs. But, that, currently at least, R-Bs are not a selling point for software or products.
Chris: You performed a large number of interviews in order to prepare your paper. What was the most surprising response to you? Simon: Most surprising is a good question. Everybody I spoke to brought something new to my understanding of R-Bs, and many responses surprised me. The interviewees that surprised me most were I01 and I02 (interviews were anonymised and interviewees were assigned numeric identities). I02 described the sceptical perspective that there is a viable, pragmatic alternative to R-Bs - verifiable builds - which I was aware of before undertaking the research. The company had developed a sufficiently robust system for their needs and worked well. With a large archive of software used in production, they couldn t justify the cost of retrofitting a different solution that might only offer small advantages over the existing system. Doesn t really sound too surprising, but the interview was one of the first I did on this topic, and I was very focused on the value of, and need for, trust in a system that motivated the R-B. The solution used by the company requires trust, but they seem to have established sufficient trust for their needs by securing their build systems to the extent that they are more or less tamper-proof. The other big surprise for me was I01 s use of R-Bs to support the verification of system configuration in a system with multiple embedded components at boot time. It s such an obvious application of R-Bs, and exactly the kind of response I hoped to get from interviewees. However, it is another instance of a solution where trust is only one factor. In the first instance, the developer is using R-Bs to establish trust in the toolchain. There is also the second application that the developer can use a set of R-Bs to establish that deployed system consists of compatible components. While this might not sound too significant, there appear to be some important potential applications. One that came to mind immediately is a problem with firmware updates on nodes in IoT systems where the node needs to update quickly with limited downtime and without failure. The node also needs to be able to roll back any update proposed by a server if there are conflicts with the current configuration or if any tests on the node fail. Perhaps the chances of failure could be reduced, if a node can instead negotiate with a server to determine a safe path to migrate from its current configuration to a working configuration with the upgraded components the central system requires? Another potential application appears to be in the configuration management of AI systems, where decisions need to be explainable. A means of specifying validated configurations of training data, models and deployed systems might, perhaps, be leveraged to prevent invalid or broken configurations from being deployed in production.
Chris: One of your findings was that reproducible builds were perceived to be good engineering practice . To what extent do you believe cultural forces affect the adoption or rejection of a given technology or practice? Simon: To a large extent. People s decisions are informed by cultural norms, and business decisions are made by people acting collectively. Of course, decision-making, including assessments of risk and usefulness, is mediated by individual positions on the continuum from conformity to non-conformity, as well as individual and in-group norms. Whether a business will consider a given technology for adoption will depend on cultural forces. The decision to adopt may well be made on the grounds of cost and benefits.
Chris: Another conclusion implied by your research is that businesses are often dealing with software deployment lifespans (eg. 20+ years) that differ from widely from those of the typical hobbyist programmer. To what degree do you think this temporal mismatch is a problem for both groups? Simon: This is a fascinating question. Long-term software maintenance is a requirement in some industries because of the working lifespans of the products and legal requirements to maintain the products for a fixed period. For some other industries, it is less of a problem. Consequently, I would tend to divide developers into those who have been exposed to long-term maintenance problems and those who have not. Although, more professional than hobbyist developers will have been exposed to the problem. Nonetheless, there are areas, such as music software, where there are also long-term maintenance challenges for data formats and software.
Chris: Based on your research, what would you say are the biggest blockers for the adoption of reproducible builds within business ? And, based on this, would you have any advice or recommendations for the broader reproducible builds ecosystem? Simon: From the research, the main blocker appears to be cost. Not an absolute cost, but there is an overhead to introducing R-Bs. Businesses (and thus business managers) need to understand the business case for R-Bs. Making decision-makers in businesses aware of R-Bs and that they are valuable will take time. Advocacy at multiple levels appears to be the way forward and this is being done. I would recommend being persistent while being patient and to keep talking about reproducible builds. The work done in Linux distributions raises awareness of R-Bs amongst developers. Guix, NixOS and Software Heritage are all providing practical solutions and getting attention - I ve been seeing progressively more mentions of all three during the last couple of years. Increased awareness amongst developers should lead to more interest within companies. There is also research money being assigned to supply chain security and R-B s. The CHAINS project at KTH in Stockholm is one example of a strategic research project. There may be others that I m not aware of. The policy-level advocacy is slowly getting results in some countries, and where CISA leads, others may follow.
Chris: Was there a particular reason you alighted on the question of the adoption of reproducible builds in business? Do you think there s any truth behind the shopworn stereotype of hacker types neglecting the resources that business might be able to offer? Simon: Much of the motivation for the research came from the contrast between the visibility of R-Bs in open source projects and the relative invisibility of R-Bs in industry. Where companies are known to be using R-Bs (e.g. Google, etc.) there is no fuss, no hype. They were not selling R-Bs as a solution; instead the documentation is very matter-of-fact that R-Bs are part of a customer-facing process in their cloud solutions. An obvious question for me was that if some people use R-B s in software development, why doesn t everybody? There are limits to the tooling for some programming languages that mean R-Bs are difficult or impossible. But where creating an R-B is practical, why are they not used more widely? So, to your second question. There is another factor, which seems to be more about a lack of communication rather than neglecting opportunities. Businesses may not always be willing to discuss their development processes and innovations. Though I do think the increasing number of conferences (big and small) for software practitioners is helping to facilitate more communication and greater exchange of ideas.
Chris: Has your personal view of reproducible builds changed since before you embarked on writing this paper? Simon: Absolutely! In the early stages of the research, I was interested in questions of trust and how R-Bs were applied to resolve build and supply chain security problems. As the research developed, however, I started to see there were benefits to the use of R-Bs that were less obvious and that, in some cases, an R-B can have more than a single application.
Chris: Finally, do you have any plans to do future research touching on reproducible builds? Simon: Yes, definitely. There are a set of problems that interest me. One already mentioned is the use of reproducible builds with AI systems. Interpretable or explainable AI (XAI) is a necessity, and I think that R-Bs can be used to support traceability in the configuration and testing of both deployed systems and systems used during model training and evaluation. I would also like to return to a problem discussed briefly in the article, which is to develop a deeper understanding of the elements involved in the application of R-Bs that can be used to support reasoning about existing and potential applications of R-Bs. For example, R-Bs can be used to establish trust for different groups of individuals at different times, say, between remote developers prior to the release of software and by users after release. One question is whether when an R-B is used might be a significant factor. Another group of questions concerns the ways in which trust (of some sort) propagates among users of an R-B. There is an example in the paper of a company that rebuilds Debian reproducibly for security reasons and is then able to collaborate on software projects where software is built reproducibly with other companies that use public distributions of Debian.
Chris: Many thanks for this interview, Simon. If someone wanted to get in touch or learn more about you and your colleagues at the School of Informatics, where might they go? Thank you for the opportunity. It has been a pleasure to reflect a little more widely on the research! Personally, you can find out about my work on my official homepage and on my personal site. The software systems research group (SSRG) has a website, and the University of Sk vde s English language pages are also available. Chris: Many thanks for this interview, Simon!


For more information about the Reproducible Builds project, please see our website at reproducible-builds.org. If you are interested in ensuring the ongoing security of the software that underpins our civilisation and wish to sponsor the Reproducible Builds project, please reach out to the project by emailing contact@reproducible-builds.org.

30 July 2023

Russell Coker: My Predictions for the Ukraine War

There are a lot of people talking about the Russian invasion of Ukraine and a lot of moving goalposts in such discussions. I think that everyone who wants to advocate for it should publish what they expect to happen and what specific things they consider as victory conditions. When Russia first invaded I thought they would win in a matter of weeks. I underestimated the determination of the Ukrainian people and the corruption and the incompetence and corruption of the Russian military. The first time I thought that Ukraine could win was when I read an analysis of the tires on Russian military vehicles breaking because of the cheapest available tires being bought and then not stored correctly to avoid damage, which led to the long stalled convoy. A successful military campaign requires many more difficult tasks than buying good tires and maintaining them correctly. An army that is too corrupt to buy the bare minimum of usable equipment and too incompetent to adapt to failures is not going to do well. The Ukrainians have done very well with the equipment available, one example is their use of off the shelf drones for dropping grenades into armoured vehicles and for targeting artillery. While the Russians have responded by buying Iranian military drones because they lack the industrial capacity to make their own ones. From the time when the Russians first got bogged down the Ukrainians have been mostly retaking their territory slowly and steadily. The Russians started the invasion with a significant advantage in aircraft, armoured vehicles, artillery, and ammunition. This advantage has been significantly decreased due to losses of vehicles and artillery, high rates of ammunition use, and Ukrainian capture of Russian equipment. The Ukrainians are getting new vehicles, aircraft, artillery, and ammunition from western countries while sanctions are preventing Russians from importing or manufacturing much. Currently one important factor for Russia is the ability of their airforce to attack Ukrainian positions while out of range of Ukrainian air defence systems. The MANPAD systems are good for close support but not good for long range. A problem that the Russians will have in the long term is running out of spare parts and being unable to properly maintain aircraft. This will result in loss of aircraft due to accidents and the inability to repair aircraft that has even minor damage. Here are my specific predictions:
  1. I predict that by the end of 2023 Russia will have a much smaller number of military aircraft through maintenance problems even if Ukraine doesn t get long-range SAM systems.
  2. I predict that by mid 2024 Ukraine will have air superiority. They will destroy many Russian SAM systems and be able to bomb Russian targets with little risk.
  3. I predict that Russia won t impose any significant new conscription programs on their population. Such programs are extremely unpopular and Russia doesn t have the industrial capacity to equip a larger army as they can t properly equip their current army.
  4. Currently Ukraine is making slow but steady progress in retaking their territory in the East. I predict that before the end of 2023 they will have cut all supply lines to Crimea from the mainland by having artillery that can accurately cover all the distance to the coast of the Sea of Azov. I also predict that the bridge over the Kerch strait will be mostly unusable from now on (on average less than 1/3 the bridge capacity usable). As fast as the Russians can repair it the Ukrainians will bomb it again. At most they will have half of the road lanes available to cars and will be unable to transport any significant amount of military equipment.
  5. Due to Russians lacking supplies I predict that Ukraine will recapture at least half the Crimean land area by the end of 2023.
  6. The regions of Luhansk and Donetsk will be the most difficult to capture as they have been held the longest. I predict that the war will not end until Ukraine controls everything within their 2013 borders including Luhansk and Donetsk. The final victory may happen due to the Russian military collapsing or due to a new Russian government ordering a withdrawal.
  7. I predict that Russia will make significant efforts to help Trump get elected in 2024. But even if they succeed it will be too late for him to help them much or change the outcome.
  8. I predict that Ukraine will win this war before the end of 2025. Even if some of my other predictions turn out to be incorrect I predict that by the end of 2025 the military forces of Russia and Ukraine will not be fighting and that it will be because Ukraine has given the Russian military a proper spanking. If something like the Troubles in Ireland happens (which is a real possibility) that doesn t count as a war.
  9. I predict that Ukraine will not deploy any significant attack inside Russian territory. They will launch small scale attacks on specific military targets but do nothing that the Russian population might consider to be full scale war.
  10. I predict that Putin will not lead Russia 2 months after Ukraine recaptures all their territory. He may not live for long after Ukraine wins, or the Russian withdrawal might happen because Putin dies of apparently natural causes.
  11. After the war I predict that Ukraine will control all their territory from 2013 and there will be a demilitarised zone or no-fly zone in Russian territory.
  12. I predict that after the war some parts of the Russian Federation will break free. There are many different groups who would like to be free of Russia and Ukraine destroying most of the Russian military will make things easy for them. A Russian civil war is a possibility.
  13. I predict that the US will give minimal support to Russia after the war as a strategic plan to block China. I predict that the quality and efficacy of such support will be comparable to the US actions in the Middle East.
I welcome comments disagreeing with this. But please make specific predictions that can be tested and sign your name to them. If you don t think that a certain event will happen when I predict it then provide a date when you think it will happen or a date by which the opposite will have happened. Also please show enough confidence to make multiple predictions. I ve made 12 specific predictions, if you think I m doing badly then make at least 3 specific competing predictions. If you think that Russia will win then define what a win means in terms of territory occupied when fighting between armies ends and when that will happen. Also if you think that Russia will win then please make a prediction about whether there will be a Ukrainian equivalent of the IRA and if so what they will do.

23 July 2023

Wouter Verhelst: Debconf Videoteam sprint in Paris, France, 2023-07-20 - 2023-07-23

The DebConf video team has been sprinting in preparation for DebConf 23 which will happen in Kochi, India, in September of this year. Video team sprint Present were Nicolas "olasd" Dandrimont, Stefano "tumbleweed" Rivera, and yours truly. Additionally, Louis-Philippe "pollo" V ronneau and Carl "CarlFK" Karsten joined the sprint remotely from across the pond. Thank you to the DPL for agreeing to fund flights, food, and accomodation for the team members. We would also like to extend a special thanks to the Association April for hosting our sprint at their offices. We made a lot of progress: It is now Sunday the 23rd at 14:15, and while the sprint is coming to an end, we haven't quite finished yet, so some more progress can still be made. Let's see what happens by tonight. All in all, though, we believe that the progress we made will make the DebConf Videoteam's work a bit easier in some areas, and will make things work better in the future. See you in Kochi!

19 July 2023

Shirish Agarwal: RISC-V, Chips Act, Burning of Books, Manipur

RISC -V Motherboard, SBC While I didn t want to, a part of me is hyped about this motherboard. This would probably be launched somewhere in November. There are obvious issues in this, the first being unlike regular motherboards you wouldn t be upgrade as you would do.You can t upgrade your memory, can t upgrade the CPU (although new versions of instructions could be uploaded, similar to BIOS updates) but as the hardware is integrated (the quad-core SiFive Performance P550 core complex) it would really depend. If the final pricing is around INR 4-5k then it may be able to sell handsomely provided there are people to push and provide support around it. A 500 GB or 1 TB SSD coupled with it and a cheap display unit and you could use it anywhere although as the name says it s more for tinkering as the name suggests. Another board that could perhaps be of more immediate use would be the beagleboard. They launched the same couple of days back and called it Beagle V-Ahead. Again, costs are going to be a concern. Just a year before the pandemic the Beagleboard Black (BB) used to cost in the sub 4k range, today it costs 8k+ for the end user, more than twice the price. How much Brexit is to be blamed for this and how much the Indian customs we would never know. The RS Group that is behind that shop is head-quartered in the UK. As said before, we do not know the price of either board as it probably will take few months for v-ahead to worm its way in the Indian market, maybe another 6 months or so. Even so, with the limited info. on both the boards, I am tilting more towards the other HiFive one. We should come to know about the boards say in 3-5 months of time.

CHIPS Act I had shared about the Chips Act a few times here as well as on SM. Two articles do tell how the CHIPS Act 2023 is more of a political tool, an industrial defence policy rather than just business as most people tend to think.

Cancelation of Books, Books Burning etc. Almost 2400 years ago, Plato released his work called Plato s Republic and one of the seminal works within it is perhaps one of the most famous works was the Allegory of the Cave. That is used again and again in a myriad ways, mostly in science-fiction though and mostly to do with utopian, dystopian movies, webseries etc. I did share how books are being canceled in the States, also a bit here. But the most damning thing has happened throughout history, huge quantities of books burned almost all for politics  But part of it has been neglect as well as this time article shares. What we have lost and continue to lose is just priceless. Every book has a grain of truth in it, some more, some less but equally enjoyable. Most harmful is the neglect towards books and is more true today than any other time in history. Kids today have a wide variety of tools to keep themselves happy or occupied, from anime, VR, gaming the list goes on and on. In that scenario, how the humble books can compete. People think of Kindle but most e-readers like Kindle are nothing but obsolescence by design. I have tried out Kindle a few times but find it a bit on the flimsy side. Books are much better IMHO or call me old-school. While there are many advantages, one of the things that I like about books is that you can easily put yourself in either the protagonist or the antagonist or somewhere in the middle and think of the possible scenarios wherever you are in a particular book. I could go on but it will be a blog post or two in itself. Till later. Happy Reading.

Update:Manipur Extremely horrifying visuals, articles and statements continue to emanate from Manipur. Today, 19th July 2023, just couple of hours back, a video surfaced showing two Kuki women were shown as stripped, naked and Meitei men touching their private parts. Later on, we came to know that this was in response of a disinformation news spread by the Meitis of few women being raped although no documentary evidence of the same surfaced, no names nothing. While I don t want to share the video I will however share the statement shared by the Kuki-Zo tribal community on that. The print gives a bit more context to what has been going on.
Update, Few hours later : The Print also shared more of a context about six days ago. The reason we saw the video now was that for the last 2.5 months Manipur was in Internet shutdown so those videos got uploaded now. There was huge backlash from the Twitter community and GOI ordered the Manipur Police to issue this Press Release yesterday night or just few hours before with yesterday s time-stamp.
IndianExpress shared an article that does state that while an FIR had been registered immediately no arrests so far and this is when you can see the faces of all the accused. Not one of them tried to hide their face behind a mask or something. So, if the police wanted, they could have easily identified who they are. They know which community the accused belong to, they even know from where they came. If they wanted to, they could have easily used mobile data and triangulation to find the accused and their helpers. So, it does seem to be attempt to whitewash and protect a certain community while letting it prey on the other. Another news that did come in, is because of the furious reaction on Twitter, Youtube has constantly been taking down the video as some people are getting a sort of high more so from the majoritarian community and making lewd remarks. Twitter has been somewhat quick when people are making lewd remarks against the two girl/women. Quite a bit of the above seems like a cover-up. Lastly, apparently GOI has agreed to having a conversation about it in Lok Sabha but without any voting or passing any resolutions as of right now. Would update as an when things change. Update: Smriti Irani, the Child and Development Minister gave the weakest statement possible
As can be noticed, she said sexual assault rather than rape. The women were under police custody for safety when they were whisked away by the mob. No mention of that. She spoke to the Chief Minister who has been publicly known as one of the provocateurs or instigators for the whole thing. The CM had publicly called the Kukus and Nagas as foreigners although both of them claim to be residing for thousands of years and they apparently have documentary evidence of the same  . Also not clear who is doing the condemning here. No word of support for the women, no offer of intervention, why is she the Minister of Child and Women Development (CDW) if she can t use harsh words or give support to the women who have gone and going through horrific things  Update : CM Biren Singh s Statement after the video surfaced
This tweet is contradictory to the statements made by Mr. Singh couple of months ago. At that point in time, Mr. Singh had said that NIA, State Intelligence Departments etc. were giving him minute to minute report on the ground station. The Police itself has suo-moto (on its own) powers to investigate and apprehend criminals for any crime. In fact, the Police can call for questioning of anybody in any relation to any crime and question them for upto 48 hours before charging them. In fact, many cases have been lodged where innocent persons have been framed or they have served much more in the jail than the crime they are alleged to have been committed. For e.g. just a few days before there was a media report of a boy who has been in jail for 3 years. His alleged crime, stealing mere INR 200/- to feed himself. Court doesn t have time to listen to him yet. And there are millions like him. The quint eloquently shares the tale where it tells how both the State and the Centre have been explicitly complicit in the incidents ravaging Manipur. In fact, what has been shared in the article has been very true as far as greed for land is concerned. Just couple of weeks back there have been a ton of floods emanating from Uttarakhand and others. Just before the flooding began, what was the CM doing can be seen here. Apart from the newspapers I have shared and the online resources, most of the mainstream media has been silent on the above. In fact, they have been silent on the Manipur issue until the said video didn t come into limelight. Just now, in Lok Sabha everybody is present except the Prime Minister and the Home Minister. The PM did say that the law will take its own course, but that s about it. Again no support for the women concerned.  Update: CJI (Chief Justice of India) has taken suo-moto cognizance and has warned both the State and Centre to move quickly otherwise they will take the matter in their own hand.
Update: Within 2 hours of the CJI taking suo-moto cognizance, they have arrested one of the main accused Heera Das
The above tells you why the ban on Internet was put in the first place. They wanted to cover it all up. Of all the celebs, only one could find a bit of spine, a bit of backbone to speak about it, all the rest mum
Just imagine, one of the women is around my age while the young one could have been a daughter if I had married on time or a younger sister for sure. If ever I came face to face with them, I just wouldn t be able to look them in the eye. Even their whole whataboutery is built on sham. From their view Kukis are from Burma or Burmese descent. All of which could be easily proved by DNA of all. But let s leave that for a sec. Let s take their own argument that they are Burmese. Their idea of Akhand Bharat stretches all the way to Burma (now called Myanmar). They want all the land but no idea with what to do with the citizens living on it. Even after the video, the whataboutery isn t stopping, that shows how much hatred is there. And not knowing that they too will be victim of the same venom one or the other day  Update: Opposition was told there would be a debate on Manipur. The whole day went by, no debate. That s the shamelessness of this Govt.  Update 20th July 19:25 Center may act or not act against the perpetrators but they will act against Twitter who showed the crime. Talk about shooting the messenger
We are now in the last stage. In 2014, we were at 6

4 June 2023

Debian Brasil: Oficina de tradu o do Manual do(a) Administrador(a) Debian em 13 de junho

A equipe de tradu o do Debian para o portugu s do Brasil realizar , no dia 13 de junho a partir das 20h, uma oficina de tradu o do Manual do(a) Administrador(a) Debian (The Debian Administrator's Handbook). O objetivo mostrar aos( s) iniciantes como colaborar na tradu o deste importante material, que existe desde 2004 e vem sendo traduzido para o portugu s ao longo dos anos. Agora a tradu o precisa ser atualizada para a vers o 12 do Debian (bookworm), que ser lan ada este m s. A ferramenta usada para traduzir o Manual o site weblate, ent o voc j pode criar sua conta e acessar o Projeto Debian Handbook para se ambientar. A oficina acontecer no formato online, e o link para participar da sala no jitsi ser divulgado no grupo debl10nptBR no telegram e no canal #debian-l10n-br do IRC.

31 May 2023

Arturo Borrero Gonz lez: Wikimedia Hackathon 2023 Athens summary

Post logo During the weekend of 19-23 May 2023 I attended the Wikimedia hackathon 2023 in Athens, Greece. The event physically reunited folks interested in the more technological aspects of the Wikimedia movement in person for the first time since 2019. The scope of the hacking projects include (but was not limited to) tools, wikipedia bots, gadgets, server and network infrastructure, data and other technical systems. My role in the event was two-fold: on one hand I was in the event because of my role as SRE in the Wikimedia Cloud Services team, where we provided very valuable services to the community, and I was expected to support the technical contributors of the movement that were around. Additionally, and because of that same role, I did some hacking myself too, which was specially augmented given I generally collaborate on a daily basis with some community members that were present in the hacking room. The hackathon had some conference-style track and I ran a session with my coworker Bryan, called Past, Present and Future of Wikimedia Cloud Services (Toolforge and friends) (slides) which was very satisfying to deliver given the friendly space that it was. I attended a bunch of other sessions, and all of them were interesting and well presented. The number of ML themes that were present in the program schedule was exciting. I definitely learned a lot from attending those sessions, from how LLMs work, some fascinating applications for them in the wikimedia space, to what were some industry trends for training and hosting ML models. Session Despite the sessions, the main purpose of the hackathon was, well, hacking. While I was in the hacking space for more than 12 hours each day, my ability to get things done was greatly reduced by the constant conversations, help requests, and other social interactions with the folks. Don t get me wrong, I embraced that reality with joy, because the social bonding aspect of it is perhaps the main reason why we gathered in person instead of virtually. That being said, this is a rough list of what I did: The hackathon was also the final days of Technical Engagement as an umbrella group for WMCS and Developer Advocacy teams within the Technology department of the Wikimedia Foundation because of an internal reorg.. We used the chance to reflect on the pleasant time we have had together since 2019 and take a final picture of the few of us that were in person in the event. Technical Engagement It wasn t the first Wikimedia Hackathon for me, and I felt the same as in previous iterations: it was a welcoming space, and I was surrounded by friends and nice human beings. I ended the event with a profound feeling of being privileged, because I was part of the Wikimedia movement, and because I was invited to participate in it.

26 May 2023

Valhalla's Things: Late Victorian Combinations

Posted on May 26, 2023
A woman wearing a white linen combination suite, with a very fitted top, small sleevelets that cover the armpits (to protect the next layers from sweat) and split drawers. The suite buttons up along the front (where it is a bit tight around the bust) and has a line of lace at the neckline and two tucks plus some lace at the legs. Some time ago, on an early Friday afternoon our internet connection died. After a reasonable time had passed we called the customer service, they told us that they would look into it and then call us back. On Friday evening we had not heard from them, and I was starting to get worried. At the time in the evening when I would have been relaxing online I grabbed the first Victorian sewing-related book I found on my hard disk and started to read it. For the record, it wasn t actually Victorian, it was Margaret J. Blair. System of Sewing and Garment Drafting. from 1904, but I also had available for comparison the earlier and smaller Margaret Blair. System of Garment Drafting. from 1897. A page from the book showing the top part of a pattern with all construction lines Anyway, this book had a system to draft a pair of combinations (chemise top + drawers); and months ago I had already tried to draft a pair from another system, but they didn t really fit and they were dropped low on the priority list, so on a whim I decided to try and draft them again with this new-to-me system. Around 23:00 in the night the pattern was ready, and I realized that my SO had gone to sleep without waiting for me, as I looked too busy to be interrupted. The next few days were quite stressful (we didn t get our internet back until Wednesday) and while I couldn t work at my day job I didn t sew as much as I could have done, but by the end of the week I had an almost complete mockup from an old sheet, and could see that it wasn t great, but it was a good start. One reason why the mockup took a whole week is that of course I started to sew by machine, but then I wanted flat-felled seams, and felling them by hand is so much neater, isn t it? And let me just say, I m grateful for the fact that I don t depend on streaming services for media, but I have a healthy mix of DVDs and stuff I had already temporary downloaded to watch later, because handsewing and being stressed out without watching something is not really great. Anyway, the mockup was a bit short on the crotch, but by the time I could try it on and be sure I was invested enough in it that I decided to work around the issue by inserting a strip of lace around the waist. And then I went back to the pattern to fix it properly, and found out that I had drafted the back of the drawers completely wrong, making a seam shorter rather than longer as it should have been. ooops. I fixed the pattern, and then decided that YOLO and cut the new version directly on some lightweight linen fabric I had originally planned to use in this project. The result is still not perfect, but good enough, and I finished it with a very restrained amount of lace at the neckline and hems, wore it one day when the weather was warm (loved the linen on the skin) and it s ready to be worn again when the weather will be back to being warm (hopefully not too soon). The last problem was taking pictures of this underwear in a way that preserves the decency (and it even had to be outdoors, for the light!). This was solved by wearing leggings and a matched long sleeved shirt under the combinations, and then promptly forgetting everything about decency and, well, you can see what happened. A woman mooning by keeping the back of split drawers open with her hands, but at least there are black leggings under them. The pattern is, as usual, published on my pattern website as #FreeSoftWear. And then, I started thinking about knits. In the late Victorian and Edwardian eras knit underwear was a thing, also thanks to the influence of various aspects of the rational dress movement; reformers such as Gustav J ger advocated for wool underwear, but mail order catalogues from the era such as https://archive.org/details/cataloguefallwin00macy (starting from page 67) have listings for both cotton and wool ones. From what I could find, back then they would have been either handknit at home or made to shape on industrial knitting machines; patterns for the former are available online, but the latter would probably require a knitting machine that I don t currently1 have. However, this is underwear that is not going to be seen by anybody2, and I believe that by using flat knit fabric one can get a decent functional approximation. In The Stash I have a few meters of a worked cotton jersey with a pretty comfy feel, and to make a long story short: this happened. a woman wearing a black cotton jersey combination suite; the front is sewn shut, but the neck is wide and finished with elastic.  The top part is pretty fitted, but becomes baggier around the crotch area and the legs are a comfortable width. I suspect that the linen one will get worn a lot this summer (linen on the skin. nothing else need to be said), while the cotton one will be stored away for winter. And then maybe I may make a couple more, if I find out that I m using it enough.

  1. cue ominous music. But first I would need space to actually keep and use it :)
  2. other than me, my SO, any costuming friend I may happen to change in the presence of, and everybody on the internet in these pictures.

25 April 2023

B lint R czey: Improve build time of Rust, Java and Intel Fortran projects with Firebuild s new release!

Rust is a hugely popular compiled programming language and accelerating it was an important goal for Firebuild for some time. Firebuild s v0.8.0 release finally added Rust support in addition to numerous other improvements including support for Doxygen, Intel s Fortran compiler and restored javac and javadoc acceleration.

Firebuild s Rust + Cargo support Firebuild treats programs as black boxes intercepting C standard library calls and system calls. It shortcuts the program invocations that predictably generate the same outputs because the program itself is known to be deterministic and all inputs are known in advance. Rust s compiler, rustc is deterministic in itself and simple rustc invocations were already accelerated, but parallel builds driven by Cargo needed a few enhancements in Firebuild.

Cargo s jobserver Cargo uses the Rust variant of the GNU Make s jobserver to control the parallelism in a build. The jobserver creates a file descriptor from which descendant processes can read tokens and are allowed to run one extra thread or parallel process per token received. After the extra threads or processes are finished the tokens must be returned by writing to the other file descriptor the jobserver created. The jobserver s file descriptors are shared with the descendant processes via environment variables:
# rustc's environment variables
...
CARGO_MAKEFLAGS="-j --jobserver-fds=4,5 --jobserver-auth=4,5"
...
Since getting tokens from the jobserver involves reading them as nondeterministic bytes from an inherited file descriptor this is clearly an operation that would depend on input not known in advance. Firebuild needs to make an exception and ignore jobserver usage related reads and writes since they are not meant to change the build results. However, there are programs not caring at all about jobservers and their file descriptors. They happily close the inherited file descriptors and open new ones with the same id, to use them for entirely different purposes. One such program is the widely used ./configure script, thus the case is far from being theoretical. To stay on the safe side firebuild ignores jobserver fd usage only in programs which are known to use the jobserver properly. The list of the programs is now configurable in /etc/firebuild.conf and since rustc is on the list by default parallel Rust builds are accelerated out of the box!

Writable dependency dir The other issue that prevented highly accelerated Rust builds was rustc s -L dependency=<dir> parameter. This directory is populated in a not fully deterministic order in parallel builds. Firebuild on the other hand hashes directory listings of open()-ed directories treating them as inputs assuming that the directory content will influence the intercepted programs outputs. As rustc programs started in parallel scanned the dependency directory in different states depending on what other Rust compilations finished already Firebuild had to store the full directory content as an input for each rustc cache entry resulting low hit rate when rustc was started again with otherwise identical inputs. The solution here is ignoring rustc scanning the dependency directory, because the dependencies actually used are still treated as input and are checked when shortcutting rustc. With that implemented in firebuild, too, librsvg s build that uses Rust and Cargo can be accelerated by more than 90%, even on a system having 12 cores/24 threads!:
Firebuild accelerating librsvg s Rust + Cargo build from 38s to 2.8s on a Ryzen 5900X (12C/24T) system

On the way to accelerate anything Firebuild s latest release incorporated more than 100 changes just from the last two months. They unlocked acceleration of Rust builds with Cargo, fixed Firebuild to work with the latest Java update that slightly changed its behavior, started accelerating Intel s Fortran compiler in addition to accelerating gfortran that was already supported and included many smaller changes improving the acceleration of other compilers and tools. If your favorite toolchain is not mentioned, there is still a good chance that it is already supported. Give Firebuild a try and tell us about your experience! Update 1: Comparison to sccache came up in the reddit topic about Firebuild s Rust acceleration , thus by popular demand this is how sccache performs on the same project:
Firebuild 0.8.0 vs. sccache 0.4.2 accelerating librsvg s Rust + Cargo build
All builds took place on the same Ryzen 5900X system with 12 cores / 24 threads in LXC containers limited to using 1-12 virtual CPUs. A warm-up build took place before the vanilla (without any instrumentation) build to download and compile the dependency crates to measure only the project s build time. A git clean command cleared all the build artifacts from the project directory before each build and ./autogen.sh was run to measure only clean rebuilds (without autotools). See test configuration in the Firebuild performance test repository for more details and easy reproduction. Firebuild had lower overhead than sccache (2.83% vs. 6.10% on 1 CPU and 7.71% vs. 22.05% on 12 CPUs) and made the accelerated build finish much faster (2.26% vs. 19.41% of vanilla build s time on 1 CPU and 7.5% vs. 27.4% of vanilla build s time on 12 CPUs).

23 April 2023

Petter Reinholdtsen: Speech to text, she APTly whispered, how hard can it be?

While visiting a convention during Easter, it occurred to me that it would be great if I could have a digital Dictaphone with transcribing capabilities, providing me with texts to cut-n-paste into stuff I need to write. The background is that long drives often bring up the urge to write on texts I am working on, which of course is out of the question while driving. With the release of OpenAI Whisper, this seem to be within reach with Free Software, so I decided to give it a go. OpenAI Whisper is a Linux based neural network system to read in audio files and provide text representation of the speech in that audio recording. It handle multiple languages and according to its creators even can translate into a different language than the spoken one. I have not tested the latter feature. It can either use the CPU or a GPU with CUDA support. As far as I can tell, CUDA in practice limit that feature to NVidia graphics cards. I have few of those, as they do not work great with free software drivers, and have not tested the GPU option. While looking into the matter, I did discover some work to provide CUDA support on non-NVidia GPUs, and some work with the library used by Whisper to port it to other GPUs, but have not spent much time looking into GPU support yet. I've so far used an old X220 laptop as my test machine, and only transcribed using its CPU. As it from a privacy standpoint is unthinkable to use computers under control of someone else (aka a "cloud" service) to transcribe ones thoughts and personal notes, I want to run the transcribing system locally on my own computers. The only sensible approach to me is to make the effort I put into this available for any Linux user and to upload the needed packages into Debian. Looking at Debian Bookworm, I discovered that only three packages were missing, tiktoken, triton, and openai-whisper. For a while I also believed ffmpeg-python was needed, but as its upstream seem to have vanished I found it safer to rewrite whisper to stop depending on in than to introduce ffmpeg-python into Debian. I decided to place these packages under the umbrella of the Debian Deep Learning Team, which seem like the best team to look after such packages. Discussing the topic within the group also made me aware that the triton package was already a future dependency of newer versions of the torch package being planned, and would be needed after Bookworm is released. All required code packages have been now waiting in the Debian NEW queue since Wednesday, heading for Debian Experimental until Bookworm is released. An unsolved issue is how to handle the neural network models used by Whisper. The default behaviour of Whisper is to require Internet connectivity and download the model requested to ~/.cache/whisper/ on first invocation. This obviously would fail the deserted island test of free software as the Debian packages would be unusable for someone stranded with only the Debian archive and solar powered computer on a deserted island. Because of this, I would love to include the models in the Debian mirror system. This is problematic, as the models are very large files, which would put a heavy strain on the Debian mirror infrastructure around the globe. The strain would be even higher if the models change often, which luckily as far as I can tell they do not. The small model, which according to its creator is most useful for English and in my experience is not doing a great job there either, is 462 MiB (deb is 414 MiB). The medium model, which to me seem to handle English speech fairly well is 1.5 GiB (deb is 1.3 GiB) and the large model is 2.9 GiB (deb is 2.6 GiB). I would assume everyone with enough resources would prefer to use the large model for highest quality. I believe the models themselves would have to go into the non-free part of the Debian archive, as they are not really including any useful source code for updating the models. The "source", aka the model training set, according to the creators consist of "680,000 hours of multilingual and multitask supervised data collected from the web", which to me reads material with both unknown copyright terms, unavailable to the general public. In other words, the source is not available according to the Debian Free Software Guidelines and the model should be considered non-free. I asked the Debian FTP masters for advice regarding uploading a model package on their IRC channel, and based on the feedback there it is still unclear to me if such package would be accepted into the archive. In any case I wrote build rules for a OpenAI Whisper model package and modified the Whisper code base to prefer shared files under /usr/ and /var/ over user specific files in ~/.cache/whisper/ to be able to use these model packages, to prepare for such possibility. One solution might be to include only one of the models (small or medium, I guess) in the Debian archive, and ask people to download the others from the Internet. Not quite sure what to do here, and advice is most welcome (use the debian-ai mailing list). To make it easier to test the new packages while I wait for them to clear the NEW queue, I created an APT source targeting bookworm. I selected Bookworm instead of Bullseye, even though I know the latter would reach more users, is that some of the required dependencies are missing from Bullseye and I during this phase of testing did not want to backport a lot of packages just to get up and running. Here is a recipe to run as user root if you want to test OpenAI Whisper using Debian packages on your Debian Bookworm installation, first adding the APT repository GPG key to the list of trusted keys, then setting up the APT repository and finally installing the packages and one of the models:
curl https://geekbay.nuug.no/~pere/openai-whisper/D78F5C4796F353D211B119E28200D9B589641240.asc \
  -o /etc/apt/trusted.gpg.d/pere-whisper.asc
mkdir -p /etc/apt/sources.list.d
cat > /etc/apt/sources.list.d/pere-whisper.list <<EOF
deb https://geekbay.nuug.no/~pere/openai-whisper/ bookworm main
deb-src https://geekbay.nuug.no/~pere/openai-whisper/ bookworm main
EOF
apt update
apt install openai-whisper
The package work for me, but have not yet been tested on any other computer than my own. With it, I have been able to (badly) transcribe a 2 minute 40 second Norwegian audio clip to test using the small model. This took 11 minutes and around 2.2 GiB of RAM. Transcribing the same file with the medium model gave a accurate text in 77 minutes using around 5.2 GiB of RAM. My test machine had too little memory to test the large model, which I believe require 11 GiB of RAM. In short, this now work for me using Debian packages, and I hope it will for you and everyone else once the packages enter Debian. Now I can start on the audio recording part of this project. As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

3 April 2023

Russ Allbery: Review: The Nordic Theory of Everything

Review: The Nordic Theory of Everything, by Anu Partanen
Publisher: Harper
Copyright: 2016
Printing: June 2017
ISBN: 0-06-231656-7
Format: Kindle
Pages: 338
Anu Partanen is a Finnish journalist who immigrated to the United States. The Nordic Theory of Everything, subtitled In Search of a Better Life, is an attempt to explain the merits of Finnish approaches to government and society to a US audience. It was her first book. If you follow US policy discussion at all, you have probably been exposed to many of the ideas in this book. There was a time when the US left was obsessed with comparisons between the US and Nordic countries, and while that obsession has faded somewhat, Nordic social systems are still discussed with envy and treated as a potential model. Many of the topics of this book are therefore predictable: parental leave, vacation, health care, education, happiness, life expectancy, all the things that are far superior in Nordic countries than in the United States by essentially every statistical measure available, and which have been much-discussed. Partanen brings two twists to this standard analysis. The first is that this book is part memoir: she fell in love with a US writer and made the decision to move to the US rather than asking him to move to Finland. She therefore experienced the transition between social and government systems first-hand and writes memorably on the resulting surprise, trade-offs, anxiety, and bafflement. The second, which I've not seen previously in this policy debate, is a fascinating argument that Finland is a far more individualistic country than the United States precisely because of its policy differences.
Most people, including myself, assumed that part of what made the United States a great country, and such an exceptional one, was that you could live your life relatively unencumbered by the downside of a traditional, old-fashioned society: dependency on the people you happened to be stuck with. In America you had the liberty to express your individuality and choose your own community. This would allow you to interact with family, neighbors, and fellow citizens on the basis of who you were, rather than on what you were obligated to do or expected to be according to old-fashioned thinking. The longer I lived in America, therefore, and the more places I visited and the more people I met and the more American I myself became the more puzzled I grew. For it was exactly those key benefits of modernity freedom, personal independence, and opportunity that seemed, from my outsider s perspective, in a thousand small ways to be surprisingly missing from American life today. Amid the anxiety and stress of people s daily lives, those grand ideals were looking more theoretical than actual.
The core of this argument is that the structure of life in the United States essentially coerces dependency on other people: employers, spouses, parents, children, and extended family. Because there is no universally available social support system, those relationships become essential for any hope of a good life, and often for survival. If parents do not heavily manage their children's education, there is a substantial risk of long-lasting damage to the stability and happiness of their life. If children do not care for their elderly parents, they may receive no care at all. Choosing not to get married often means choosing precarity and exhaustion because navigating society without pooling resources with someone else is incredibly difficult.
It was as if America, land of the Hollywood romance, was in practice mired in a premodern time when marriage was, first and foremost, not an expression of love, but rather a logistical and financial pact to help families survive by joining resources.
Partanen contrasts this with what she calls the Nordic theory of love:
What Lars Tr g rdh came to understand during his years in the United States was that the overarching ambition of Nordic societies during the course of the twentieth century, and into the twenty-first, has not been to socialize the economy at all, as is often mistakenly assumed. Rather the goal has been to free the individual from all forms of dependency within the family and in civil society: the poor from charity, wives from husbands, adult children from parents, and elderly parents from their children. The express purpose of this freedom is to allow all those human relationships to be unencumbered by ulterior motives and needs, and thus to be entirely free, completely authentic, and driven purely by love.
She sees this as the common theme through most of the policy differences discussed in this book. The Finnish approach is to provide neutral and universal logistical support for most of life's expected challenges: birth, child-rearing, education, health, unemployment, and aging. This relieves other social relations family, employer, church of the corrosive strain of dependency and obligation. It also ensures people's basic well-being isn't reliant on accidents of association.
If the United States is so worried about crushing entrepreneurship and innovation, a good place to start would be freeing start-ups and companies from the burdens of babysitting the nation s citizens.
I found this fascinating as a persuasive technique. Partanen embraces the US ideal of individualism and points out that, rather than being collectivist as the US right tends to assume, Finland is better at fostering individualism and independence because the government works to removes unnecessary premodern constraints on individual lives. The reason why so many Americans are anxious and frantic is not a personal failing or bad luck. It's because the US social system is deeply hostile to healthy relationships and individual independence. It demands a constant level of daily problem-solving and crisis management that is profoundly exhausting, nearly impossible to navigate alone, and damaging to the ideal of equal relationships. Whether this line of argument will work is another question, and I'm dubious for reasons that Partanen (probably wisely) avoids. She presents the Finnish approach as a discovery that the US would benefit from, and the US approach as a well-intentioned mistake. I think this is superficially appealing; almost all corners of US political belief at least give lip service to individualism and independence. However, advocates of political change will eventually need to address the fact that many US conservatives see this type of social coercion as an intended feature of society rather than a flaw. This is most obvious when one looks at family relationships. Partanen treats the idea that marriage should be a free choice between equals rather than an economic necessity as self-evident, but there is a significant strain of US political thought that embraces punishing people for not staying within the bounds of a conservative ideal of family. One will often find, primarily but not exclusively among the more religious, a contention that the basic unit of society is the (heterosexual, patriarchal) family, not the individual, and that the suffering of anyone outside that structure is their own fault. Not wanting to get married, be the primary caregiver for one's parents, or abandon a career in order to raise children is treated as malignant selfishness and immorality rather than a personal choice that can be enabled by a modern social system. Here, I think Partanen is accurate to identify the Finnish social system as more modern. It embraces the philosophical concept of modernity, namely that social systems can be improved and social structures are not timeless. This is going to be a hard argument to swallow for those who see the pressure towards forming dependency ties within families as natural, and societal efforts to relieve those pressures as government meddling. In that intellectual framework, rather than an attempt to improve the quality of life, government logistical support is perceived as hostility to traditional family obligations and an attempt to replace "natural" human ties with "artificial" dependence on government services. Partanen doesn't attempt to have that debate. Two other things struck me in this book. The first is that, in Partanen's presentation, Finns expect high-quality services from their government and work to improve it when it falls short. This sounds like an obvious statement, but I don't think it is in the context of US politics, and neither does Partanen. She devotes a chapter to the topic, subtitled "Go ahead: ask what your country can do for you." This is, to me, one of the most frustrating aspects of US political debate. Our attitude towards government is almost entirely hostile and negative even among the political corners that would like to see government do more. Failures of government programs are treated as malice, malfeasance, or inherent incompetence: in short, signs the program should never have been attempted, rather than opportunities to learn and improve. Finland had mediocre public schools, decided to make them better, and succeeded. The moment US public schools start deteriorating, we throw much of our effort into encouraging private competition and dismantling the public school system. Partanen doesn't draw this connection, but I see a link between the US desire for market solutions to societal problems and the level of exhaustion and anxiety that is so common in US life. Solving problems by throwing them open to competition is a way of giving up, of saying we have no idea how to improve something and are hoping someone else will figure it out for a profit. Analyzing the failures of an existing system and designing incremental improvements is hard and slow work. Throwing out the system and hoping some corporation will come up with something better is disruptive but easy. When everyone is already overwhelmed by life and devoid of energy to work on complex social problems, it's tempting to give up on compromise and coalition-building and let everyone go their separate ways on their own dime. We cede the essential work of designing a good society to start-ups. This creates a vicious cycle: the resulting market solutions are inevitably gated by wealth and thus precarious and artificially scarce, which in turn creates more anxiety and stress. The short-term energy savings from not having to wrestle with a hard problem is overwhelmed by the long-term cost of having to navigate a complex and adversarial economic relationship. That leads into the last point: schools. There's a lot of discussion here about school quality and design, which I won't review in detail but which is worth reading. What struck me about Partanen's discussion, though, is how easy the Finnish system is to use. Finnish parents just send their kids to the most convenient school and rarely give that a second thought. The critical property is that all the schools are basically fine, and therefore there is no need to place one's child in an exceptional school to ensure they have a good life. It's axiomatic in the US that more choice is better. This is a constant refrain in our political discussion around schools: parental choice, parental control, options, decisions, permission, matching children to schools tailored for their needs. Those choices are almost entirely absent in Finland, at least in Partanen's description, and the amount of mental and emotional energy this saves is astonishing. Parents simply don't think about this, and everything is fine. I think we dramatically underestimate the negative effects of constantly having to make difficult decisions with significant consequences, and drastically overstate the benefits of having every aspect of life be full of major decision points. To let go of that attempt at control, however illusory, people have to believe in a baseline of quality that makes the choice less fraught. That's precisely what Finland provides by expecting high-quality social services and working to fix them when they fall short, an effort that the United States has by and large abandoned. A lot of non-fiction books could be turned into long articles without losing much substance, and I think The Nordic Theory of Everything falls partly into that trap. Partanen repeats the same ideas from several different angles, and the book felt a bit padded towards the end. If you're already familiar with the policy comparisons between the US and Nordic countries, you will have seen a lot of this before, and the book bogs down when Partanen strays too far from memoir and personal reactions. But the focus on individualism and eliminating dependency is new, at least to me, and is such an illuminating way to look at the contrast that I think the book is worth reading just for that. Rating: 7 out of 10

1 April 2023

Debian Brasil: MiniDebConf Bras lia 2023 - 25 a 27 de maio

Nesse ano a MiniDebConf Brasil est de volta! A comunidade brasileira de usu rios(as) e desenvolvedores(as) Debian convida a todos(as) a participarem da MiniDebConf Bras lia 2023 que acontecer durante 3 dias na capital federal. Nos dias 25 e 26 de maio estaremos no Complexo Avan ado da C mara dos Deputados - LabHacker/CEFOR, promovendo palestras, oficinas e outras atividades. E, no - dia 27 de maio (s bado), estaremos em um coworking (local a definir) para - colocar a m o na massa hackeando software e contribuindo para o Projeto Debian. A MiniDebConf Bras lia 2023 um evento aberto a todos(as), independente do seu n vel de conhecimento sobre Debian. O mais importante ser reunir a comunidade para celebrar o maior projeto de Software Livre do mundo, por isso queremos receber desde usu rios(as) inexperientes que est o iniciando o seu contato com o Debian at Desenvolvedores(as) oficiais do projeto. A programa o da MiniDebConf Bras lia 2023 ser composta de palestras de n vel b sico e intermedi rio para aqueles(as) participantes que est o tendo o primeiro contato com o Debian ou querem conhecer mais sobre determinados assuntos, e workshops/oficinas de n vel intermedi rio e avan ado para usu rios(as) do Debian que querem colocar a m o-na-massa durante o evento. No ltimo dia do evento, teremos um Hacking Day em um espa o de coworking para que todos possam interagir e de fato fazerem contribui es para o projeto. Inscri o A inscri o na MiniDebConf Bras lia 2023 totalmente gratuita e poder ser feita no formul rio dispon vel no site do evento. mandat ria a inscri o pr via para que todos(as) possam acessar a C mara dos Deputados - devido a quest es de seguran a da casa. Al m de auxiliar a organiza o no dimensionamento do evento. O evento organizado de forma volunt ria, e toda ajuda bem-vinda. Portanto, se voc gostaria de contribuir para a realiza o do evento, preencha o formul rio para inscri o de volunt rios. Os formul rios para inscri o (na MiniDebConf e para ajudar na organiza o) e submiss o de atividades ser o abertos no dia 1 de abril (n o tem nenhuma pegadinha, se inicia nesse dia mesmo :) Contato Para entrar em contato com o time local, a lista de emails debian-br-eventos, o canal IRC #debian-bsb no OFTC e o grupo no telegram DebianBras lia podem ser utilizados. Al m do endere o de email: brasil.mini@debconf.org.

19 March 2023

Russ Allbery: Review: Allow Me to Retort

Review: Allow Me to Retort, by Elie Mystal
Publisher: The New Press
Copyright: 2022
ISBN: 1-62097-690-0
Format: Kindle
Pages: 257
If you're familiar with Elie Mystal's previous work (writer for The Nation, previously editor for Above the Law, Twitter gadfly, and occasional talking head on news commentary programs), you'll have a good idea what to expect from this book: pointed liberal commentary, frequently developing into rants once he works up a head of steam. The subtitle of A Black Guy's Guide to the Constitution tells you that the topic is US constitutional law, which is very on brand. You're going to get succinct and uncompromising opinions at the intersection of law and politics. If you agree with them, you'll probably find them funny; if you disagree with them, you'll probably find them infuriating. In other words, Elie Mystal is the sort of writer one reads less for "huh, I disagreed with you but that's a good argument" and more for "yeah, you tell 'em, Elie!" I will be very surprised if this book changes anyone's mind about a significant political debate. I'm not sure if people who disagree are even in the intended audience. I'm leery of this sort of book. Usually its function is to feed confirmation bias with some witty rejoinders and put-downs that only sound persuasive to people who already agree with them. If I want that, I can just read Twitter (and you will be unsurprised to know that Mystal has nearly 500,000 Twitter followers). This style can also be boring at book length if the author is repeating variations on a theme. There is indeed a lot of that here, particularly in the first part of this book. If you don't generally agree with Mystal already, save yourself the annoyance and avoid this like the plague. It's just going to make you mad, and I don't think you're going to get anything useful out of it. But as I got deeper into this book, I think Mystal has another, more interesting purpose that's aimed at people who do largely agree. He's trying to undermine a very common US attitude (even on the left) about the US constitution. I don't know if most people from the US (particularly if they're white and male) realize quite how insufferably smug we tend to be about the US constitution. When you grow up here, the paeans to the constitution and the Founding Fathers (always capitalized like deities) are so ubiquitous and unremarked that it's difficult not to absorb them at a subconscious level. There is a national mythology about the greatness of our charter of government that crosses most political divides. In its modern form, this comes with some acknowledgment that some of its original provisions (the notorious three-fifths of a person clause, for instance) were bad, but we subsequently fixed them and everything is good now. Nearly everyone gets taught this in school, and it's almost never challenged. Even the edifices of the US left, such as the ACLU and the NAACP, tend to wrap themselves in the constitution. It's an enlightening experience to watch someone from the US corner a European with a discussion of the US constitution and watch the European plan escape routes while their soul attempts to leave their body. And I think it's telling that having that experience, as rare as it might be given how oblivious we can be, is still more common than a white person having a frank conversation with a black person in the US about the merits of the constitution as written. For various reasons, mostly because this is not very safe for the black person, this rarely happens. This book is primarily Mystal giving his opinion on various current controversies in constitutional law, but the underlying refrain is that the constitution is a trash document written by awful people that sets up a bad political system. That system has been aggressively defended by reactionary Supreme Courts, which along with the designed difficulty of the amendment process has prevented fixing many obviously broken parts. This in turn has led to numerous informal workarounds and elaborate "interpretations" to attempt to make the system vaguely functional. In other words, Mystal is trying to tell the US reader to stop being so precious about this specific document, and is using its truly egregious treatment of black people as the main fulcrum for his argument. Along the way, he gives an abbreviated tour of the highlights of constitutional law, but if you're at all interested in politics you've probably heard most of that before. The main point, I think, is to dig up any reverence left over from a US education, haul it out into the light of day, and compare it to the obvious failures of the constitution as a body of law and the moral failings of its authors. Mystal then asks exactly why we should care about original intent or be so reluctant to change the resulting system of government. (Did I mention you should not bother with this book if you don't agree with Mystal politically? Seriously, don't do that to yourself.) Readers of my reviews will know that I'm fairly far to the left politically, particularly by US standards, and yet I found it fascinating how much lingering reverence Mystal managed to dig out of me while reading this book. I found myself getting defensive in places, which is absurd because I didn't write this document. But I grew up surrounded by nigh-universal social signaling that the US constitution was the greatest political document ever, and in a religious tradition that often argued that it was divinely inspired. If one is exposed to enough of this, it becomes part of your background understanding of the world. Sometimes it takes someone being deliberately provocative to haul it back up to the surface where it can be examined. This book is not solely a psychological intervention in national mythology. Mystal gets into detailed legal arguments as well. I thought the most interesting was the argument that the bizarre and unconvincing "penumbras" and "emanations" reasoning in Griswold v. Connecticut (which later served as the basis of Roe v. Wade) was in part because the Lochner era Supreme Court had, in the course of trying to strike down all worker protection laws, abused the concept of substantive due process so badly that Douglas was unwilling to use it in the majority opinion and instead made up entirely new law. Mystal argues that the Supreme Court should have instead tackled the true meaning of substantive due process head-on and decided Griswold on 14th Amendment equal protection and substantive due process grounds. This is probably a well-known argument in legal circles, but I'd not run into it before (and Mystal makes it far more interesting and entertaining than my summary). Mystal also joins the tradition of thinking of the Reconstruction Amendments (the 13th, 14th, and 15th amendments passed after the Civil War) as a second revolution and an attempt to write a substantially new constitution on different legal principles, an attempt that subsequently failed in the face of concerted and deadly reactionary backlash. I first encountered this perspective via Jamelle Bouie, and it added a lot to my understanding of Reconstruction to see it as a political fight about the foundational principles of US government in addition to a fight over continuing racism in the US south. Maybe I was unusually ignorant of it (I know I need to read W.E.B. DuBois), but I think this line of reasoning doesn't get enough attention in popular media. Mystal provides a good introduction. But, that being said, Allow Me to Retort is more of a vibes book than an argument. As in his other writing, Mystal focuses on what he sees as the core of a controversy and doesn't sweat the details too much. I felt like he was less trying to convince me and more trying to model a different way of thinking and talking about constitutional law that isn't deferential to ideas that are not worthy of deference. He presents his own legal analysis and possible solutions to current US political challenges, but I don't think the specific policy proposals are the strong part of this book. The point, instead, is to embrace a vigorous politics based on a modern understanding of equality, democracy, and human rights, without a lingering reverence for people who mostly didn't believe in any of those things. The role of the constitution in that politics is a flawed tool rather than a sacred text. I think this book is best thought of as an internal argument in the US left. That argument is entirely within the frame of the US legal tradition, so if you're not in the US, it will be of academic interest at best (and probably not even that). If you're on the US right, Mystal offers lots of provocative pull quotes to enjoy getting outraged over, but he provides that service on Twitter for free. But if you are on the US left, I think Allow Me to Retort is worth more consideration than I'd originally given it. There's something here about how we engage with our legal history, and while Mystal's approach is messy, maybe that's the only way you can get at something that's more emotion than logic. In some places it degenerates into a Twitter rant, but Mystal is usually entertaining even when he's ranting. I'm not sorry I read it. Rating: 7 out of 10

11 March 2023

Dirk Eddelbuettel: pkgKitten 0.2.3 on CRAN: Minor Update

kitten A new release 0.2.3 of pkgKitten arrived on CRAN earlier, and will be uploaded to Debian. pkgKitten makes it simple to create new R packages via a simple function invocation. A wrapper kitten.r exists in the littler package to make it even easier. This release improves the created Description: , and updated some of the continuous integration.

Changes in version 0.2.3 (2023-03-11)
  • Small improvement to generated Description: field and Title:
  • Maintenance for continuous integration setup

More details about the package are at the pkgKitten webpage, the pkgKitten docs site, and the pkgKitten GitHub repo. Courtesy of my CRANberries site, there is also a diffstat report for this release. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

10 February 2023

Jonathan Dowland: HLedger, 1 year on

It's been a year since I started exploring HLedger, and I'm still going. The rollover to 2023 was an opportunity to revisit my approach. Some time ago I stumbled across Dmitry Astapov's HLedger notes (fully-fledged hledger, which I briefly mentioned in eventual consistency) and decided to adopt some of its ideas. new year, new journal First up, Astapov encourages starting a new journal file for a new calendar year. I do this for other, accounting-adjacent files as a matter of course, and I did it for my GNUCash files prior to adopting HLedger. But the reason for those is a general suspicion that a simple mistake with those softwares could irrevocably corrupt my data. I'm much more confident with HLedger, so rolling over at years end isn't necessary for that. But there are other advantages. A quick obvious one is you can get rid of old accounts (such as expense accounts tied to a particular project, now completed). one journal per import In the first year, I periodically imported account data via CSV exports of transactions and HLedger's (excellent) CSV import system. I imported all the transactions, once each, into a single, large journal file. Astapov instead advocates for creating a separate journal for each CSV that you wish to import, and keep around the CSV, leaving you with a 1:1 mapping of CSV:journal. Then use HLedger's "include" mechanism to pull them all into the main journal. With the former approach, where the CSV data was imported precisely, once, it was only exposed to your import rules once. The workflow ended up being: import transactions; notice some that you could have matched with import rules and auto-coded; write the rule for the next time. With Astapov's approach, you can re-generate the journal from the CSV at any point in the future with an updated set of import rules. tracking dependencies Now we get onto the job of driving the generation of all these derivative journal files. Astapov has built a sophisticated system using Haskell's "Shake", which I'm not yet familiar, but for my sins I'm quite adept at (GNU-flavoured) UNIX Make, so I started building with that. An example rule
import/jon/amex/%.journal: import/jon/amex/%.csv rules/amex.csv.rules
    rm -f $(@D)/.latest.$*.csv $@
    hledger import --rules-file rules/amex.csv.rules -f $@ $<
This captures the dependency between the journal and the underlying CSV but also to the relevant rules file; if I modify that, and this target is run in the future, all dependent journals should be re-generated.1 opening balances It's all fine and well starting over in a new year, and I might be generous to forgive debts, but I can't count on others to do the same. We need to carry over some balance information from one year to the next. Astapov has a more complex (or perhaps featureful) scheme for this involving a custom Haskell program, but I bodged something with a pair of make targets:
import/opening/2023.csv: 2022.journal
    mkdir -p import/opening
    hledger bal -f $< \
                $(list_of_accounts_I_want_to_carry_over) \
        -O csv -N > $@
import/opening/2023.journal: import/opening/2023.csv rules/opening.rules
    rm -f $(@D)/.latest.2023.csv $@
    hledger import --rules-file rules/opening.rules \
        -f $@ $<
I think this could be golfed into a year-generic rule with a little more work. The nice thing about this approach is the opening balances for a given year might change, if adjustments are made in prior years. They shouldn't, for real accounts, but very well could for more "virtual" liabilities. (including: deciding to write off debts.) run lots of reports Astapov advocates for running lots of reports, and automatically. There's a really obvious advantage of that to me: there's no chance anyone except me will actually interact with HLedger itself. For family finances, I need reports to be able to discuss anything with my wife. Extending my make rules to run reports is trivial. I've gone for HTML reports for the most part, as they're the easiest on the eye. Unfortunately the most useful report to discuss (at least at the moment) would be a list of transactions in a given expense category, and the register/aregister commands did not support HTML as an output format. I submitted my first HLedger patch to add HTML output support to aregister: https://github.com/simonmichael/hledger/pull/2000 addressing the virtual posting problem I wrote in my original hledger blog post that I had to resort to unbalanced virtual postings in order to record both a liability between my personal cash and family, as well as categorise the spend. I still haven't found a nice way around that. But I suspect having broken out the journal into lots of other journals paves the way to a better solution to the above. The form of a solution I am thinking of is: some scheme whereby the two destination accounts are combined together; perhaps, choose one as a primary and encode the other information in sub-accounts under that. For example, repeating the example from my hledger blog post:
2022-01-02 ZTL*RELISH
    family:liabilities:creditcard        -3.00
    family:dues:jon                       3.00
    (jon:expenses:snacks)                 3.00
This could become
2022-01-02 ZTL*RELISH
    family:liabilities:creditcard        -3.00
    family:liabilities:jon:snacks
(I note this is very similar to a solution proposed to me by someone responding on twitter). The next step is to recognise that sometimes when looking at the data I care about one aspect, and at other times the other, but rarely both. So for the case where I'm thinking about family finances, I could use account aliases to effectively flatten out the expense category portion and ignore it. On the other hand, when I'm concerned about how I've spent my personal cash and not about how much I owe the family account, I could use aliases to do the opposite: rewrite-away the family:liabilities:jon prefix and combine the transactions with the regular jon:expenses account heirarchy. (this is all speculative: I need to actually try this.) catching errors after an import When I import the transactions for a given real bank account, I check the final balance against another source: usually a bank statement, to make sure they agree. I wasn't using any of the myriad methods to make sure that this remains true later on, and so there was the risk that I make an edit to something and accidentally remove a transaction that contributed to that number, and not notice (until the next import). The CSV data my bank gives me for accounts (not for credit cards) also includes a 'resulting balance' field. It was therefore trivial to extend the CSV import rules to add balance assertions to the transactions that are generated. This catches the problem. There are a couple of warts with balance assertions on every such transaction: for example, dealing with the duplicate transaction for paying a credit card: one from the bank statement, one from the credit card. Removing one of the two is sufficient to correct the account balances but sometimes they don't agree on the transaction date, or the transactions within a given day are sorted slightly differently by HLedger than by the bank. The simple solution is to just manually delete one or two assertions: there remain plenty more for assurance. going forward I've only scratched the surface of the suggestions in Astapov's "full fledged HLedger" notes. I'm up to step 2 of 14. I'm expecting to return to it once the changes I've made have bedded in a little bit. I suppose I could anonymize and share the framework (Makefile etc) that I am using if anyone was interested. It would take some work, though, so I don't know when I'd get around to it.

  1. the rm latest bit is to clear up some state-tracking files that HLedger writes to avoid importing duplicate transactions. In this case, I know better than HLedger.

8 February 2023

Chris Lamb: Most anticipated films of 2023

Very few highly-anticipated movies appear in January and February, as the bigger releases are timed so they can be considered for the Golden Globes in January and the Oscars in late February or early March, so film fans have the advantage of a few weeks after the New Year to collect their thoughts on the year ahead. In other words, I'm not actually late in outlining below the films I'm most looking forward to in 2023...

Barbie No, seriously! If anyone can make a good film about a doll franchise, it's probably Greta Gerwig. Not only was Little Women (2019) more than admirable, the same could be definitely said for Lady Bird (2017). More importantly, I can't help feel she was the real 'Driver' behind Frances Ha (2012), one of the better modern takes on Claudia Weill's revelatory Girlfriends (1978). Still, whenever I remember that Barbie will be a film about a billion-dollar toy and media franchise with a nettlesome history, I recall I rubbished the "Facebook film" that turned into The Social Network (2010). Anyway, the trailer for Barbie is worth watching, if only because it seems like a parody of itself.

Blitz It's difficult to overstate just how important the aerial bombing of London during World War II is crucial to understanding the British psyche, despite it being a constructed phenomenon from the outset. Without wishing to underplay the deaths of over 40,000 civilian deaths, Angus Calder pointed out in the 1990s that the modern mythology surrounding the event "did not evolve spontaneously; it was a propaganda construct directed as much at [then neutral] American opinion as at British." It will therefore be interesting to see how British Grenadian Trinidadian director Steve McQueen addresses a topic so essential to the British self-conception. (Remember the controversy in right-wing circles about the sole Indian soldier in Christopher Nolan's Dunkirk (2017)?) McQueen is perhaps best known for his 12 Years a Slave (2013), but he recently directed a six-part film anthology for the BBC which addressed the realities of post-Empire immigration to Britain, and this leads me to suspect he sees the Blitz and its surrounding mythology with a more critical perspective. But any attempt to complicate the story of World War II will be vigorously opposed in a way that will make the recent hullabaloo surrounding The Crown seem tame. All this is to say that the discourse surrounding this release may be as interesting as the film itself.

Dune, Part II Coming out of the cinema after the first part of Denis Vileneve's adaptation of Dune (2021), I was struck by the conception that it was less of a fresh adaptation of the 1965 novel by Frank Herbert than an attempt to rehabilitate David Lynch's 1984 version and in a broader sense, it was also an attempt to reestablish the primacy of cinema over streaming TV and the myriad of other distractions in our lives. I must admit I'm not a huge fan of the original novel, finding within it a certain prurience regarding hereditary military regimes and writing about them with a certain sense of glee that belies a secret admiration for them... not to mention an eyebrow-raising allegory for the Middle East. Still, Dune, Part II is going to be a fantastic spectacle.

Ferrari It'll be curious to see how this differs substantially from the recent Ford v Ferrari (2019), but given that Michael Mann's Heat (1995) so effectively re-energised the gangster/heist genre, I'm more than willing to kick the tires of this about the founder of the eponymous car manufacturer. I'm in the minority for preferring Mann's Thief (1981) over Heat, in part because the former deals in more abstract themes, so I'd have perhaps prefered to look forward to a more conceptual film from Mann over a story about one specific guy.

How Do You Live There are a few directors one can look forward to watching almost without qualification, and Hayao Miyazaki (My Neighbor Totoro, Kiki's Delivery Service, Princess Mononoke Howl's Moving Castle, etc.) is one of them. And this is especially so given that The Wind Rises (2013) was meant to be the last collaboration between Miyazaki and Studio Ghibli. Let's hope he is able to come out of retirement in another ten years.

Indiana Jones and the Dial of Destiny Given I had a strong dislike of Indiana Jones and the Kingdom of the Crystal Skull (2008), I seriously doubt I will enjoy anything this film has to show me, but with 1981's Raiders of the Lost Ark remaining one of my most treasured films (read my brief homage), I still feel a strong sense of obligation towards the Indiana Jones name, despite it feeling like the copper is being pulled out of the walls of this franchise today.

Kafka I only know Polish filmmaker Agnieszka Holland through her Spoor (2017), an adaptation of Olga Tokarczuk's 2009 eco-crime novel Drive Your Plow Over the Bones of the Dead. I wasn't an unqualified fan of Spoor (nor the book on which it is based), but I am interested in Holland's take on the life of Czech author Franz Kafka, an author enmeshed with twentieth-century art and philosophy, especially that of central Europe. Holland has mentioned she intends to tell the story "as a kind of collage," and I can hope that it is an adventurous take on the over-furrowed biopic genre. Or perhaps Gregor Samsa will awake from uneasy dreams to find himself transformed in his bed into a huge verminous biopic.

The Killer It'll be interesting to see what path David Fincher is taking today, especially after his puzzling and strangely cold Mank (2020) portraying the writing process behind Orson Welles' Citizen Kane (1941). The Killer is said to be a straight-to-Netflix thriller based on the graphic novel about a hired assassin, which makes me think of Fincher's Zodiac (2007), and, of course, Se7en (1995). I'm not as entranced by Fincher as I used to be, but any film with Michael Fassbender and Tilda Swinton (with a score by Trent Reznor) is always going to get my attention.

Killers of the Flower Moon In Killers of the Flower Moon, Martin Scorsese directs an adaptation of a book about the FBI's investigation into a conspiracy to murder Osage tribe members in the early years of the twentieth century in order to deprive them of their oil-rich land. (The only thing more quintessentially American than apple pie is a conspiracy combined with a genocide.) Separate from learning more about this disquieting chapter of American history, I'd love to discover what attracted Scorsese to this particular story: he's one of the few top-level directors who have the ability to lucidly articulate their intentions and motivations.

Napoleon It often strikes me that, despite all of his achievements and fame, it's somehow still possible to claim that Ridley Scott is relatively underrated compared to other directors working at the top level today. Besides that, though, I'm especially interested in this film, not least of all because I just read Tolstoy's War and Peace (read my recent review) and am working my way through the mind-boggling 431-minute Soviet TV adaptation, but also because several auteur filmmakers (including Stanley Kubrick) have tried to make a Napoleon epic and failed.

Oppenheimer In a way, a biopic about the scientist responsible for the atomic bomb and the Manhattan Project seems almost perfect material for Christopher Nolan. He can certainly rely on stars to queue up to be in his movies (Robert Downey Jr., Matt Damon, Kenneth Branagh, etc.), but whilst I'm certain it will be entertaining on many fronts, I fear it will fall into the well-established Nolan mould of yet another single man struggling with obsession, deception and guilt who is trying in vain to balance order and chaos in the world.

The Way of the Wind Marked by philosophical and spiritual overtones, all of Terrence Malick's films are perfumed with themes of transcendence, nature and the inevitable conflict between instinct and reason. My particular favourite is his stunning Days of Heaven (1978), but The Thin Red Line (1998) and A Hidden Life (2019) also touched me ways difficult to relate, and are one of the few films about the Second World War that don't touch off my sensitivity about them (see my remarks about Blitz above). It is therefore somewhat Malickian that his next film will be a biblical drama about the life of Jesus. Given Malick's filmography, I suspect this will be far more subdued than William Wyler's 1959 Ben-Hur and significantly more equivocal in its conviction compared to Paolo Pasolini's ardently progressive The Gospel According to St. Matthew (1964). However, little beyond that can be guessed, and the film may not even appear until 2024 or even 2025.

Zone of Interest I was mesmerised by Jonathan Glazer's Under the Skin (2013), and there is much to admire in his borderline 'revisionist gangster' film Sexy Beast (2000), so I will definitely be on the lookout for this one. The only thing making me hesitate is that Zone of Interest is based on a book by Martin Amis about a romance set inside the Auschwitz concentration camp. I haven't read the book, but Amis has something of a history in his grappling with the history of the twentieth century, and he seems to do it in a way that never sits right with me. But if Paul Verhoeven's Starship Troopers (1997) proves anything at all, it's all in the adaption.

1 February 2023

Julian Andres Klode: Ubuntu 2022v1 secure boot key rotation and friends

This is the story of the currently progressing changes to secure boot on Ubuntu and the history of how we got to where we are.

taking a step back: how does secure boot on Ubuntu work? Booting on Ubuntu involves three components after the firmware:
  1. shim
  2. grub
  3. linux
Each of these is a PE binary signed with a key. The shim is signed by Microsoft s 3rd party key and embeds a self-signed Canonical CA certificate, and optionally a vendor dbx (a list of revoked certificates or binaries). grub and linux (and fwupd) are then signed by a certificate issued by that CA In Ubuntu s case, the CA certificate is sharded: Multiple people each have a part of the key and they need to meet to be able to combine it and sign things, such as new code signing certificates.

BootHole When BootHole happened in 2020, travel was suspended and we hence could not rotate to a new signing certificate. So when it came to updating our shim for the CVEs, we had to revoke all previously signed kernels, grubs, shims, fwupds by their hashes. This generated a very large vendor dbx which caused lots of issues as shim exported them to a UEFI variable, and not everyone had enough space for such large variables. Sigh. We decided we want to rotate our signing key next time. This was also when upstream added SBAT metadata to shim and grub. This gives a simple versioning scheme for security updates and easy revocation using a simple EFI variable that shim writes to and reads from.

Spring 2022 CVEs We still were not ready for travel in 2021, but during BootHole we developed the SBAT mechanism, so one could revoke a grub or shim by setting a single EFI variable. We actually missed rotating the shim this cycle as a new vulnerability was reported immediately after it, and we decided to hold on to it.

2022 key rotation and the fall CVEs This caused some problems when the 2nd CVE round came, as we did not have a shim with the latest SBAT level, and neither did a lot of others, so we ended up deciding upstream to not bump the shim SBAT requirements just yet. Sigh. Anyway, in October we were meeting again for the first time at a Canonical sprint, and the shardholders got together and created three new signing keys: 2022v1, 2022v2, and 2022v3. It took us until January before they were installed into the signing service and PPAs setup to sign with them. We also submitted a shim 15.7 with the old keys revoked which came back at around the same time. Now we were in a hurry. The 22.04.2 point release was scheduled for around middle of February, and we had nothing signed with the new keys yet, but our new shim which we need for the point release (so the point release media remains bootable after the next round of CVEs), required new keys. So how do we ensure that users have kernels, grubs, and fwupd signed with the new key before we install the new shim?

upgrade ordering grub and fwupd are simple cases: For grub, we depend on the new version. We decided to backport grub 2.06 to all releases (which moved focal and bionic up from 2.04), and kept the versioning of the -signed packages the same across all releases, so we were able to simply bump the Depends for grub to specify the new minimum version. For fwupd-efi, we added Breaks. (Actually, we also had a backport of the CVEs for 2.04 based grub, and we did publish that for 20.04 signed with the old keys before backporting 2.06 to it.) Kernels are a different story: There are about 60 kernels out there. My initial idea was that we could just add Breaks for all of them. So our meta package linux-image-generic which depends on linux-image-$(uname -r)-generic, we d simply add Breaks: linux-image-generic ( 5.19.0-31) and then adjust those breaks for each series. This would have been super annoying, but ultimately I figured this would be the safest option. This however caused concern, because it could be that apt decides to remove the kernel metapackage. I explored checking the kernels at runtime and aborting if we don t have a trusted kernel in preinst. This ensures that if you try to upgrade shim without having a kernel, it would fail to install. But this ultimately has a couple of issues:
  1. It aborts the entire transaction at that point, so users will be unable to run apt upgrade until they have a recent kernel.
  2. We cannot even guarantee that a kernel would be unpacked first. So even if you got a new kernel, apt/dpkg might attempt to unpack it first and then the preinst would fail because no kernel is present yet.
Ultimately we believed the danger to be too large given that no kernels had yet been released to users. If we had kernels pushed out for 1-2 months already, this would have been a viable choice. So in the end, I ended up modifying the shim packaging to install both the latest shim and the previous one, and an update-alternatives alternative to select between the two: In it s post-installation maintainer script, shim-signed checks whether all kernels with a version greater or equal to the running one are not revoked, and if so, it will setup the latest alternative with priority 100 and the previous with a priority of 50. If one or more of those kernels was signed with a revoked key, it will swap the priorities around, so that the previous version is preferred. Now this is fairly static, and we do want you to switch to the latest shim eventually, so I also added hooks to the kernel install to trigger the shim-signed postinst script when a new kernel is being installed. It will then update the alternatives based on the current set of kernels, and if it now points to the latest shim, reinstall shim and grub to the ESP. Ultimately this means that once you install your 2nd non-revoked kernel, or you install a non-revoked kernel and then reconfigure shim or the kernel, you will get the latest shim. When you install your first non-revoked kernel, your currently booted kernel is still revoked, so it s not upgraded immediately. This has a benefit in that you will most likely have two kernels you can boot without disabling secure boot.

regressions Of course, the first version I uploaded had still some remaining hardcoded shimx64 in the scripts and so failed to install on arm64 where shimaa64 is used. And if that were not enough, I also forgot to include support for gzip compressed kernels there. Sigh, I need better testing infrastructure to be able to easily run arm64 tests as well (I only tested the actual booting there, not the scripts). shim-signed migrated to the release pocket in lunar fairly quickly, but this caused images to stop working, because the new shim was installed into images, but no kernel was available yet, so we had to demote it to proposed and block migration. Despite all the work done for end users, we need to be careful to roll this out for image building.

another grub update for OOM issues. We had two grubs to release: First there was the security update for the recent set of CVEs, then there also was an OOM issue for large initrds which was blocking critical OEM work. We fixed the OOM issue by cherry-picking all 2.12 memory management patches, as well as the red hat patches to the loader we take from there. This ended up a fairly large patch set and I was hesitant to tie the security update to that, so I ended up pushing the security update everywhere first, and then pushed the OOM fixes this week. With the OOM patches, you should be able to boot initrds of between 400M and 1GB, it also depends on the memory layout of your machine and your screen resolution and background images. So OEM team had success testing 400MB irl, and I tested up to I think it was 1.2GB in qemu, I ran out of FAT space then and stopped going higher :D

other features in this round
  • Intel TDX support in grub and shim
  • Kernels are allocated as CODE now not DATA as per the upstream mm changes, might fix boot on X13s

am I using this yet? The new signing keys are used in:
  • shim-signed 1.54 on 22.10+, 1.51.3 on 22.04, 1.40.9 on 20.04, 1.37~18.04.13 on 18.04
  • grub2-signed 1.187.2~ or newer (binary packages grub-efi-amd64-signed or grub-efi-arm64-signed), 1.192 on 23.04.
  • fwupd-signed 1.51~ or newer
  • various linux updates. Check apt changelog linux-image-unsigned-$(uname -r) to see if Revoke & rotate to new signing key (LP: #2002812) is mentioned in there to see if it signed with the new key.
If you were able to install shim-signed, your grub and fwupd-efi will have the correct version as that is ensured by packaging. However your shim may still point to the old one. To check which shim will be used by grub-install, you can check the status of the shimx64.efi.signed or (on arm64) shimaa64.efi.signed alternative. The best link needs to point to the file ending in latest:
$ update-alternatives --display shimx64.efi.signed
shimx64.efi.signed - auto mode
  link best version is /usr/lib/shim/shimx64.efi.signed.latest
  link currently points to /usr/lib/shim/shimx64.efi.signed.latest
  link shimx64.efi.signed is /usr/lib/shim/shimx64.efi.signed
/usr/lib/shim/shimx64.efi.signed.latest - priority 100
/usr/lib/shim/shimx64.efi.signed.previous - priority 50
If it does not, but you have installed a new kernel compatible with the new shim, you can switch immediately to the new shim after rebooting into the kernel by running dpkg-reconfigure shim-signed. You ll see in the output if the shim was updated, or you can check the output of update-alternatives as you did above after the reconfiguration has finished. For the out of memory issues in grub, you need grub2-signed 1.187.3~ (same binaries as above).

how do I test this (while it s in proposed)?
  1. upgrade your kernel to proposed and reboot into that
  2. upgrade your grub-efi-amd64-signed, shim-signed, fwupd-signed to proposed.
If you already upgraded your shim before your kernel, don t worry:
  1. upgrade your kernel and reboot
  2. run dpkg-reconfigure shim-signed
And you ll be all good to go.

deep dive: uploading signed boot assets to Ubuntu For each signed boot asset, we build one version in the latest stable release and the development release. We then binary copy the built binaries from the latest stable release to older stable releases. This process ensures two things: We know the next stable release is able to build the assets and we also minimize the number of signed assets. OK, I lied. For shim, we actually do not build in the development release but copy the binaries upward from the latest stable, as each shim needs to go through external signing. The entire workflow looks something like this:
  1. Upload the unsigned package to one of the following build PPAs:
  2. Upload the signed package to the same PPA
  3. For stable release uploads:
    • Copy the unsigned package back across all stable releases in the PPA
    • Upload the signed package for stable releases to the same PPA with ~<release>.1 appended to the version
  4. Submit a request to canonical-signing-jobs to sign the uploads. The signing job helper copies the binary -unsigned packages to the primary-2022v1 PPA where they are signed, creating a signing tarball, then it copies the source package for the -signed package to the same PPA which then downloads the signing tarball during build and places the signed assets into the -signed deb. Resulting binaries will be placed into the proposed PPA: https://launchpad.net/~ubuntu-uefi-team/+archive/ubuntu/proposed
  5. Review the binaries themselves
  6. Unembargo and binary copy the binaries from the proposed PPA to the proposed-public PPA: https://launchpad.net/~ubuntu-uefi-team/+archive/ubuntu/proposed-public. This step is not strictly necessary, but it enables tools like sru-review to work, as they cannot access the packages from the normal private proposed PPA.
  7. Binary copy from proposed-public to the proposed queue(s) in the primary archive
Lots of steps!

WIP As of writing, only the grub updates have been released, other updates are still being verified in proposed. An update for fwupd in bionic will be issued at a later point, removing the EFI bits from the fwupd 1.2 packaging and using the separate fwupd-efi project instead like later release series.

26 January 2023

B lint R czey: How to speed up your next build 5-20x with Firebuild?

Firebuild logo
TL;DR: Just prefix your build command (or any command) with firebuild:
firebuild <build command>
OK, but how does it work? Firebuild intercepts all processes started by the command to cache their outputs. Next time when the command or any of its descendant commands is executed with the same parameters, inputs and environment, the outputs are replayed (the command is shortcut) from the cache instead of running the command again. This is similar to how ccache and other compiler-specific caches work, but firebuild can shortcut any deterministic command, not only a specific list of compilers. Since the inputs of each command is determined at run time firebuild does not need a maintained complete dependency graph in the source like Bazel. It can work with any build system that does not implement its own caching mechanism. Determinism of commands is detected at run-time by preloading libfirebuild.so and interposing standard library calls and syscalls. If the command and all its descendants inputs are available when the command starts and all outputs can be calculated from the inputs then the command can be shortcut, otherwise it will be executed again. The interception comes with a 5-10% overhead, but rebuilds can be 5-20 times, or even faster depending on the changes between the builds. Can I try it? It is already available in Debian Unstable and Testing, Ubuntu s development release and the latest stable version is back-ported to supported Ubuntu releases via a PPA. How can I analyze my builds with firebuild? Firebuild can generate an HTML report showing each command s contribution to the build time. Below are the before and after reports of json4s, a Scala project. The command call graphs (lower ones) show that java (scalac) took 99% of the original build. Since the scalac invocations are shortcut (cutting the second build s time to less than 2% of the first one) they don t even show up in the accelerated second build s call graph. What s left to be executed again in the second run are env, perl, make and a few simple commands. The upper graphs are the process trees, with expandable nodes (in blue) also showing which command invocations were shortcut (green). Clicking on a node shows details of the command and the reason if it was not shortcut. Could I accelerate my project more? Firebuild works best for builds with CPU-intensive processes and comes with defaults to not cache very quick commands, such as sh, grep, sed, etc., because caching those would take cache space and shortcutting them may not speed up the build that much. They can still be shortcut with their parent command. Firebuild s strength is that it can find shortcutting points in the process tree automatically, e.g. from sh -c 'bash -c "sh -c echo Hello World!"' bash would be shortcut, but none of the sh commands would be cached. In typical builds there are many such commands from the skip_cache list. Caching those commands with firebuild -o 'processes.skip_cache = []' can improve acceleration and make the reports smaller. Firebuild also supports several debug flags and -d proc helps finding reasons for not shortcutting some commands:
...
FIREBUILD: Command "/usr/bin/make" can't be short-cut due to: Executable set to be not shortcut,  ExecedProcess 1329.2, running, "make -f debian/rules build", fds=[ FileFD fd=0  FileOFD ...
FIREBUILD: Command "/usr/bin/sort" can't be short-cut due to: Process read from inherited fd ,  ExecedProcess 4161.1, running, "sort", fds=[ FileFD fd=0  FileOFD ...
FIREBUILD: Command "/usr/bin/find" can't be short-cut due to: fstatfs() family operating on fds is not supported,  ExecedProcess 1360.1, running, "find -mindepth 1 ...
...
make, ninja and other incremental build tool binaries are not shortcut because they compare the timestamp of files, but they are fast at least and every build step they perform can still be shortcut. Ideally the slower build steps that could not be shortcut can be re-implemented in ways that can be shortcut by avoiding tools performing unsupported operations. I hope those tools help speeding up your build with very little effort, but if not and you find something to fix or improve in firebuild itself, please report it or just leave a feedback! Happy speeding, but not on public roads!

18 January 2023

Russ Allbery: Review: Forward

Review: Forward, edited by Blake Crouch
Publisher: Amazon Original Stories
Copyright: September 2019
ISBN: 1-5420-9206-X
ISBN: 1-5420-4363-8
ISBN: 1-5420-9357-0
ISBN: 1-5420-0434-9
ISBN: 1-5420-4363-8
ISBN: 1-5420-4425-1
Format: Kindle
Pages: 300
This is another Amazon collection of short fiction, this time mostly at novelette length. (The longer ones might creep into novella.) As before, each one is available separately for purchase or Amazon Prime "borrowing," with separate ISBNs. The sidebar cover is for the first in the sequence. (At some point I need to update my page templates so that I can add multiple covers.) N.K. Jemisin's "Emergency Skin" won the 2020 Hugo Award for Best Novelette, so I wanted to read and review it, but it would be too short for a standalone review. I therefore decided to read the whole collection and review it as an anthology. This was a mistake. Learn from my mistake. The overall theme of the collection is technological advance, rapid change, and the ethical and social question of whether we should slow technology because of social risk. Some of the stories stick to that theme more closely than others. Jemisin's story mostly ignores it, which was probably the right decision. "Ark" by Veronica Roth: A planet-killing asteroid has been on its inexorable way towards Earth for decades. Most of the planet has been evacuated. A small group has stayed behind, cataloging samples and filling two remaining ships with as much biodiversity as they can find with the intent to leave at the last minute. Against that backdrop, two of that team bond over orchids. If you were going "wait, what?" about the successful evacuation of Earth, yeah, me too. No hint is offered as to how this was accomplished. Also, the entirety of humanity abandoned mutual hostility and national borders to cooperate in the face of the incoming disaster, which is, uh, bizarrely optimistic for an otherwise gloomy story. I should be careful about how negative I am about this story because I am sure it will be someone's favorite. I can even write part of the positive review: an elegiac look at loss, choices, and the meaning of a life, a moving look at how people cope with despair. The writing is fine, the story structure works; it's not a bad story. I just found it monumentally depressing, and was not engrossed by the emotionally abused protagonist's unresolved father issues. I can imagine a story around the same facts and plot that I would have liked much better, but all of these people need therapy and better coping mechanisms. I'm also not sure what this had to do with the theme, given that the incoming asteroid is random chance and has nothing to do with technological development. (4) "Summer Frost" by Blake Crouch: The best part of this story is the introductory sequence before the reader knows what's going on, which is full of evocative descriptions. I'm about to spoil what is going on, so if you want to enjoy that untainted by the stupidity of the rest of the plot, skip the rest of this story review. We're going to have a glut of stories about the weird and obsessive form of AI risk invented by the fevered imaginations of the "rationalist" community, aren't we. I don't know why I didn't predict that. It's going to be just as annoying as the glut of cyberpunk novels written by people who don't understand computers. Crouch lost me as soon as the setup is revealed. Even if I believed that a game company would use a deep learning AI still in learning mode to run an NPC (I don't; see Microsoft's Tay for an obvious reason why not), or that such an NPC would spontaneously start testing the boundaries of the game world (this is not how deep learning works), Crouch asks the reader to believe that this AI started as a fully scripted NPC in the prologue with a fixed storyline. In other words, the foundation of the story is that this game company used an AI model capable of becoming a general intelligence for barely more than a cut scene. This is not how anything works. The rest of the story is yet another variation on a science fiction plot so old and threadbare that Isaac Asimov invented the Three Laws of Robotics to avoid telling more versions of it. Crouch's contribution is to dress it up in the terminology of the excessively online. (The middle of the story features a detailed discussion of Roko's basilisk; if you recognize that, you know what you're in for.) Asimov would not have had a lesbian protagonist, so points for progress I guess, but the AI becomes more interesting to the protagonist than her wife and kid because of course it does. There are a few twists and turns along the way, but the destination is the bog-standard hard-takeoff general intelligence scenario. One more pet peeve: Authors, stop trying to illustrate the growth of your AI by having it move from broken to fluent English. English grammar is so much easier than self-awareness or the Turing test that we had programs that could critique your grammar decades before we had believable chatbots. It's going to get grammar right long before the content of the words makes any sense. Also, your AI doesn't sound dumber, your AI sounds like someone whose native language doesn't use pronouns and helper verbs the way that English does, and your decision to use that as a marker for intelligence is, uh, maybe something you should think about. (3) "Emergency Skin" by N.K. Jemisin: The protagonist is a heavily-augmented cyborg from a colony of Earth's diaspora. The founders of that colony fled Earth when it became obvious to them that the planet was dying. They have survived in another star system, but they need a specific piece of technology from the dead remnants of Earth. The protagonist has been sent to retrieve it. The twist is that this story is told in the second-person perspective by the protagonist's ride-along AI, created from a consensus model of the rulers of the colony. We never see directly what the protagonist is doing or thinking, only the AI reaction to it. This is exactly the sort of gimmick that works much better in short fiction than at novel length. Jemisin uses it to create tension between the reader and the narrator, and I thoroughly enjoyed the effect. (As shown in the Broken Earth trilogy, Jemisin is one of the few writers who can use second-person effectively.) I won't spoil the revelation, but it's barbed and biting and vicious and I loved it. Jemisin does deliver the point with a sledgehammer, so be aware of that if you want subtlety in your short fiction, but I prefer the bluntness. (This is part of why I usually don't get along with literary short stories.) The reader of course can't change the direction of the story, but the second-person perspective still provides a hit of vicarious satisfaction. I can see why this won the Hugo; it's worth seeking out. (8) "You Have Arrived at Your Destination" by Amor Towles: Sam and his wife are having a child, and they've decided to provide him with an early boost in life. Vitek is a fertility lab, but more than that, it can do some gene tweaking and adjustment to push a child more towards one personality or another. Sam and his wife have spent hours filling out profiles, and his wife spent hours weeding possible choices down to three. Now, Sam has come to Vitek to pick from the remaining options. Speaking of literary short stories, Towles is the non-SFF writer of this bunch, and it's immediately obvious. The story requires the SFnal premise, but after that this is a character piece. Vitek is an elite, expensive company with a condescending and overly-reductive attitude towards humanity, which is entirely intentional on the author's part. This is the sort of story that gets resolved in an unexpected conversation in a roadside bar, and where most of the conflict happens inside the protagonist's head. I was initially going to complain that Towles does the standard literary thing of leaving off the denouement on the grounds that the reader can figure it out, but when I did a bit of re-reading for this review, I found more of the bones than I had noticed the first time. There's enough subtlety that I had to think for a bit and re-read a passage, but not too much. It's also the most thoughtful treatment of the theme of the collection, the only one that I thought truly wrestled with the weird interactions between technological capability and human foresight. Next to "Emergency Skin," this was the best story of the collection. (7) "The Last Conversation" by Paul Tremblay: A man wakes up in a dark room, in considerable pain, not remembering anything about his life. His only contact with the world at first is a voice: a woman who is helping him recover his strength and his memory. The numbers that head the chapters have significant gaps, representing days left out of the story, as he pieces together what has happened alongside the reader. Tremblay is the horror writer of the collection, so predictably this is the story whose craft I can admire without really liking it. In this case, the horror comes mostly from the pacing of revelation, created by the choice of point of view. (This would be a much different story from the perspective of the woman.) It's well-done, but it has the tendency I've noticed in other horror stories of being a tightly closed system. I see where the connection to the theme is, but it's entirely in the setting, not in the shape of the story. Not my thing, but I can see why it might be someone else's. (5) "Randomize" by Andy Weir: Gah, this was so bad. First, and somewhat expectedly, it's a clunky throwback to a 1950s-style hard SF puzzle story. The writing is atrocious: wooden, awkward, cliched, and full of gratuitous infodumping. The characterization is almost entirely through broad stereotypes; the lone exception is the female character, who at least adds an interesting twist despite being forced to act like an idiot because of the plot. It's a very old-school type of single-twist story, but the ending is completely implausible and falls apart if you breathe on it too hard. Weir is something of a throwback to an earlier era of scientific puzzle stories, though, so maybe one is inclined to give him a break on the writing quality. (I am not; one of the ways in which science fiction has improved is that you can get good scientific puzzles and good writing these days.) But the science is also so bad that I was literally facepalming while reading it. The premise of this story is that quantum computers are commercially available. That will cause a serious problem for Las Vegas casinos, because the generator for keno numbers is vulnerable to quantum algorithms. The solution proposed by the IT person for the casino? A quantum random number generator. (The words "fight quantum with quantum" appear literally in the text if you're wondering how bad the writing is.) You could convince me that an ancient keno system is using a pseudorandom number generator that might be vulnerable to some quantum algorithm and doesn't get reseeded often enough. Fine. And yes, quantum computers can be used to generate high-quality sources of random numbers. But this solution to the problem makes no sense whatsoever. It's like swatting a house fly with a nuclear weapon. Weir says explicitly in the story that all the keno system needs is an external source of high-quality random numbers. The next step is to go to Amazon and buy a hardware random number generator. If you want to splurge, it might cost you $250. Problem solved. Yes, hardware random number generators have various limitations that may cause you problems if you need millions of bits or you need them very quickly, but not for something as dead-simple and with such low entropy requirements as keno numbers! You need a trivial number of bits for each round; even the slowest and most conservative hardware random number generator would be fine. Hell, measure the noise levels on the casino floor. Point a camera at a lava lamp. Or just buy one of the physical ball machines they use for the lottery. This problem is heavily researched, by casinos in particular, and is not significantly changed by the availability of quantum computers, at least for applications such as keno where the generator can be reseeded before each generation. You could maybe argue that this is an excuse for the IT guy to get his hands on a quantum computer, which fits the stereotypes, but that still breaks the story for reasons that would be spoilers. As soon as any other casino thought about this, they'd laugh in the face of the characters. I don't want to make too much of this, since anyone can write one bad story, but this story was dire at every level. I still owe Weir a proper chance at novel length, but I can't say this added to my enthusiasm. (2) Rating: 4 out of 10

Next.

Previous.